Preface


It is no surprise that arithmetic properties of integral (‘whole’) numbers
are controlled by analytic functions of a complex variable. At the same time,
the values of analytic functions themselves happen to be interesting num- 
bers, for which we often seek explicit expressions in terms of other ‘better 
known’ numbers or try to prove that no such exist. This natural symbiosis of 
number theory and analysis is centuries old but keeps enjoying new results, 
ideas and methods. The present book takes a semi-systematic review of an- 
alytic achievements in number theory ranging from classical themes about 
primes, continued fractions, transcendence of π and resolution of Hilbert’s
seventh problem to some recent developments on the irrationality of the val- 
ues of Riemann’s zeta function, sizes of non-cyclotomic algebraic integers 
and applications of hypergeometric functions to integer congruences. Our 
principal goal is to present a variety of different analytic techniques that are 
used in number theory, at a reasonably accessible — almost popular — level, 
so that the material in this book is suitable for teaching a graduate course
on the topic or for self-study. Exercises included are of varying difficulty
and of varying distribution within the book (some chapters get more than
others); they not only help the reader to consolidate their understanding of
the material but also suggest directions for further study and investigation. 
Furthermore, the end of each chapter features brief notes about relevant 
developments of the themes discussed. 

Rome was not built in a day. One needs to get comfortable about the 
concept of using analytic tools in number theory, and this serves as a good 
reason for going first through the topics that are traditionally represented in 
books on analytic number theory. This is the principal task of Chapters 2, 4 
and 5, for which complementary (or alternative) sources are the existing books
[4, 9, 21, 33, 36, 47, 59]. Chapters 2, 5 and 6 are particularly close to the
exposition in [33] (never translated into English); this was my first textbook
on the subject and a great influence on my personal perception. In addition, 
an inspiration for Chapter 8 came from a series of lectures by Yu. Nesterenko 
on transcendental numbers held in the 1990s, while Chapter 4 is a tribute 
to the late A. van der Poorten and his revolutionary simplistic treatment 
of continued fractions (the book [14] is a recommended complement to the 
chapter). The tone of Chapter 3 is more analytic; parts of it are rooted in
the classical Whittaker-Watson textbook [80]. The choice of topics in the
remaining chapters is guided by my personal tastes. 

My close family (het gezin in Dutch) has been a lasting inspiration for 
my academic career, of which this book is a tiny outcome. Thank you, Olga 
and Victor, for your constant support and encouragement during the years! 

It is my pleasure to thank friends and colleagues whose valuable feedback 
helped to improve the text: Heng Huat Chan, Bjorn Johannesson, Pieter 
Moree, Berend Ringeling, and Armin Straub. The staff and editors at the 
World Scientific assisted me at all stages of transmission of my manuscript 
into the publication, and I am particularly thankful to Rok Ting Tan for 
making this book possible. 

The last but not least thing is to thank the reader who follows the book 
in either physical or electronic format. Enjoy! 


Wadim Zudilin 
Nijmegen (NL) and Newcastle (AU) 
April 2023 
Chapter 1


Numbers and q-numbers


1.1 Prime numbers 


The task of finding the greatest common divisor (a, b) of two integers a and 
b is traditionally performed using the Euclidean algorithm (or its numerous 
extensions). The following statement is a usual companion of the algorithm, 
which we review (in a somewhat friendlier context) in Chapter 4. Our proof 
below is less efficient but works well and is simpler from a theoretical point 
of view. 


Lemma 1.1. Suppose that two positive integers a and b are relatively prime,
(a, b) = 1. Then the (diophantine) equation ax + by = 1 is solvable in
integers x, y.


Proof. Perform mathematical induction on a + b. The base of induction
corresponds to a + b = 2, hence a = b = 1, in which case we can take x = 1
and y = 0 as the solution of ax + by = 1. Assume that the statement is
true for a + b < r and consider the situation a + b = r. Since (a, b) = 1,
assuming without loss of generality that a > b we also have (a − b, b) = 1, so
that (a − b)u + bv = 1 has a solution u, v ∈ Z because (a − b) + b = a < r.
But then the pair x = u, y = v − u solves ax + by = 1.
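
The induction in the proof is constructive and can be sketched in code. The function name `solve_coprime` is an illustrative choice, not something from the text; the recursion mirrors the step from (a − b, b) to (a, b), so its depth grows with a + b and it is meant for small inputs only.

```python
def solve_coprime(a: int, b: int) -> tuple[int, int]:
    """Return integers (x, y) with a*x + b*y == 1 for coprime positive a, b."""
    if a < 1 or b < 1:
        raise ValueError("a and b must be positive")
    if a == b:
        if a == 1:
            return 1, 0  # base of induction: a = b = 1
        raise ValueError("a and b are not coprime")
    if a > b:
        # (a - b)*u + b*v = 1 implies a*u + b*(v - u) = 1
        u, v = solve_coprime(a - b, b)
        return u, v - u
    # a < b: swap the roles of a and b
    y, x = solve_coprime(b, a)
    return x, y
```

For instance, `solve_coprime(5, 3)` returns a pair (x, y) with 5x + 3y = 1. The Euclidean algorithm reaches the same goal in far fewer steps.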


Recall that a positive integer p ≥ 2 is said to be prime if all its positive
integer divisors are exhausted by 1 and p itself.


Lemma 1.2. If a product of several factors is divisible by a prime p then
at least one of the factors is divisible by p.


Proof. Without loss of generality consider the case of a product of just two
factors a and b. Given ab is divisible by p, we either have p | a (and then
the required statement follows) or p is coprime to a. In the latter case,
(a, p) = 1 so that ax + py = 1 for some x, y ∈ Z by Lemma 1.1. Multiplying
both sides of the equality by b we obtain abx + pby = b. The prime
p divides both summands on the left-hand side because p | ab (by the
hypothesis) and p | p for trivial reasons. It follows then that p divides the
right-hand side, that is, b.


Lemma 1.3. The least divisor a ≥ 2 of an integer n ≥ 2 is a prime number.


Proof. Assuming on the contrary that a is not prime, we conclude that it
possesses a smaller divisor b ≥ 2. Then b | a and a | n imply b | n, so
that b ≥ 2 is a smaller divisor of n than a. This contradiction leads to the
desired conclusion.


Theorem 1.1 (fundamental theorem of arithmetic). Any integer greater 
than 1 is decomposed into a product of primes, with possible repetitions, 
and this decomposition is unique up to permutation of the primes in the 
product. 


Proof. Let n ≥ 2 be given. Take its least divisor $p_1 > 1$; it is prime by
Lemma 1.3, and so $n = p_1 n_1$ for some $n_1 \ge 1$. If $n_1 = 1$ then we already
have the decomposition of n into a product of primes; otherwise, apply the
procedure to $n_1 \ge 2$ to get $n_1 = p_2 n_2$, etc. Since $n_1 < n$, this process can
be formalised through mathematical induction.

The uniqueness is established in a similar fashion, using Lemma 1.2 as
the principal ingredient. Assume there is an n > 1 for which two different
representations $n = p_1 \cdots p_s$ and $n = q_1 \cdots q_r$ exist, where all $p_i$ and $q_j$ are primes, and
choose n to be the least positive integer with this property. It follows from the
first representation that $p_1 \mid n$, so that the product $q_1 \cdots q_r$ is divisible by $p_1$,
hence at least one of the factors $q_j$ is divisible by $p_1$. By rearranging the
order we can assume that $q_1$ is divisible by $p_1$. But $q_1$ is prime, therefore $q_1 = p_1$.
Now we cancel the factor $p_1 = q_1$ and denote $n_1 = p_2 \cdots p_s = q_2 \cdots q_r$.
Since $n_1 < n$, the decomposition of $n_1$ into a product of primes is unique,
so that r = s and $p_2, \dots, p_s$ is a rearrangement of $q_2, \dots, q_r$. This implies
that $n = p_1 \cdots p_s = q_1 \cdots q_r$ are the same representation, which contradicts
our choice of n.


A standard way to write the decomposition of a positive integer n into
primes is

\[ n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}, \tag{1.1} \]

where $p_1, \dots, p_s$ are pairwise distinct primes and $a_1, \dots, a_s$ are positive
exponents.
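
The existence part of the proof of Theorem 1.1 (repeatedly split off the least divisor, which is prime by Lemma 1.3) translates into a short routine. This is a minimal sketch; the name `factorize` and the dictionary output format (prime mapped to exponent) are choices made here for illustration.

```python
def factorize(n: int) -> dict[int, int]:
    """Return {prime: exponent} for n >= 1, as in the canonical decomposition."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors: dict[int, int] = {}
    d = 2
    while n > 1:
        if d * d > n:
            d = n  # no divisor up to sqrt(n): the remaining n is itself prime
        if n % d == 0:
            # d is the least divisor >= 2 of the current n, hence prime
            factors[d] = factors.get(d, 0) + 1
            n //= d
        else:
            d += 1
    return factors
```

For example, `factorize(360)` gives the exponents 3, 2 and 1 for the primes 2, 3 and 5, and `factorize(1)` returns the empty dictionary (the empty product).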


Exercise 1.1. Show that all positive divisors of n are exhausted by the list

\[ \{ p_1^{b_1} p_2^{b_2} \cdots p_s^{b_s} : 0 \le b_i \le a_i, \; i = 1, \dots, s \} \]

and compute the total number τ(n) of the divisors and the sum σ(n) of the
divisors.


The arithmetic functions τ(n) and σ(n) from the exercise are examples
of multiplicative functions we will meet in Section 2.4. Another example
is the Möbius function μ(n) defined by $\mu(p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}) = (-1)^s$ if
$a_1 = a_2 = \dots = a_s = 1$ and by 0 otherwise (in other words, when n is not square-free,
that is, divisible by the square of some integer m > 1); conventionally,
μ(1) = 1 as an empty product is always understood as 1.


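Exercise 1.2. (a) Prove that

\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases} \]

(b) Prove the Möbius inversion formula: if
\[ F(n) = \sum_{d \mid n} f(d) \]
then
\[ f(n) = \sum_{d \mid n} \mu(n/d) F(d) = \sum_{d \mid n} \mu(d) F(n/d). \]
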
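
A numerical sanity check (not a proof) of both parts of Exercise 1.2 can be set up as follows. The helper names `mobius`, `divisors`, `summatory` and `mobius_inverse` are illustrative, and the inversion is taken in the standard form: F(n) is the sum of f(d) over d | n, and f(n) is recovered as the sum of μ(n/d)F(d).

```python
def mobius(n: int) -> int:
    """Moebius function: (-1)^s for a product of s distinct primes, else 0."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # divisible by the square d^2
            result = -result
        d += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def divisors(n: int) -> list[int]:
    return [d for d in range(1, n + 1) if n % d == 0]


def summatory(f, n: int) -> int:
    """F(n) = sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in divisors(n))


def mobius_inverse(F, n: int) -> int:
    """Sum of mu(n/d) * F(d) over the divisors d of n."""
    return sum(mobius(n // d) * F(d) for d in divisors(n))
```

With, say, f(n) = n² + 1 and F(n) = `summatory(f, n)`, the value `mobius_inverse(F, n)` should agree with f(n) for every n tested.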
In a slightly different arithmetic direction, we can use the explicit structure
of divisors of natural numbers to control the prime decomposition of
the factorial n! = 1 · 2 · 3 ⋯ n.


Exercise 1.3. (a) Let
\[ n! = \prod_{p \le n} p^{\nu_p} \]
be the canonical decomposition of n! into the product of primes. Show
that
\[ \nu_p = \operatorname{ord}_p(n!) = \left\lfloor \frac{n}{p} \right\rfloor + \left\lfloor \frac{n}{p^2} \right\rfloor + \left\lfloor \frac{n}{p^3} \right\rfloor + \dotsb. \]
(Here $\lfloor x \rfloor$ denotes the integer part of a real number x, that is, $\lfloor x \rfloor \le x < \lfloor x \rfloor + 1$,
and the sum terminates since the terms for which $p^k > n$
contribute trivially.)

(b) Show that the very same $\nu_p$ can be computed via the formula
$(n - S_p(n))/(p - 1)$, where $S_p(n)$ denotes the sum of digits in n written in
base p.
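
Both formulas of Exercise 1.3 are easy to compare numerically against a direct count of the power of p in n!. The sketch below is only a consistency check; the function names are hypothetical.

```python
from math import factorial


def legendre(n: int, p: int) -> int:
    """Sum of floor(n / p^k) over k >= 1 (the formula of part (a))."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total


def digit_sum(n: int, p: int) -> int:
    """Sum of the digits of n written in base p."""
    s = 0
    while n:
        s += n % p
        n //= p
    return s


def ord_p(m: int, p: int) -> int:
    """Exponent of the prime p in the positive integer m, by repeated division."""
    k = 0
    while m % p == 0:
        m //= p
        k += 1
    return k
```

For each small n and prime p one can then compare `legendre(n, p)`, `ord_p(factorial(n), p)` and `(n - digit_sum(n, p)) // (p - 1)`.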


We return to the prime number theme in greater detail in Chapter 2 for 
investigating the asymptotic distribution of primes. 


1.2 Integer-valued factorial ratios 


There are many ways to define the binomial coefficients $\binom{n}{m}$: for example,
as the quantities appearing in the binomial theorem

\[ (a + b)^n = \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m} \tag{1.2} \]

or, more pragmatically, via the explicit formula
\[ \binom{n}{m} = \frac{n!}{(n - m)! \, m!}, \quad \text{where } n \ge m \ge 0. \]
It is a fundamental fact that these ratios of factorials are integers. There 
are several ways of demonstrating this. 


Analytical proof. Use the Pascal triangle relations
\[ \binom{n}{m} = \binom{n-1}{m} + \binom{n-1}{m-1} \]
and mathematical induction.


Linear algebra proof. Use the generating function

\[ \sum_{m=0}^{n} \binom{n}{m} x^m = (1 + x)^n = \underbrace{(1 + x) \cdots (1 + x)}_{n \text{ times}}. \]


Combinatorial proof. $\binom{n}{m}$ counts the number of m-element subsets of an
n-set.


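Arithmetic proof. The order in which a prime p enters n! is computed in
Exercise 1.3. Setting $x = (n - m)/p^k$ and $y = m/p^k$ in the inequality

\[ \lfloor x + y \rfloor - \lfloor x \rfloor - \lfloor y \rfloor \ge 0, \]
and summing over the positive integers k, we see that

\[ \operatorname{ord}_p \binom{n}{m} = \operatorname{ord}_p(n!) - \operatorname{ord}_p(m!) - \operatorname{ord}_p((n - m)!) \ge 0 \quad \text{for any prime } p. \]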


You may find the last strategy most sophisticated. But it is truly arith- 
metic! 
Exercise 1.4. The Catalan numbers are officially defined by

\[ C_n = \frac{1}{n+1} \binom{2n}{n} \quad \text{for } n = 0, 1, 2, \dots. \]

(a) Prove that

\[ C_{n+1} = \sum_{i=0}^{n} C_i C_{n-i} \quad \text{for } n \ge 0. \]

(b) Show that the Catalan numbers are integral for any n ≥ 0.

Hint. You may choose neither to notice that $C_n = \binom{2n}{n} - \binom{2n}{n+1}$ nor to use
part (a).


Binomial coefficients are a source of many other integral ratios of factorials, like
Tenia eles 


Pa omar a (4n)! 6 ta 


There are however many other cases not reducible to binomials, like
\[ \frac{n! \, (30n)!}{(6n)! \, (10n)! \, (15n)!}. \tag{1.3} \]
In 1850 Chebyshev used the integrality of these ratios to give a sharp upper
bound for the prime counting function.


Exercise 1.5. (a) Prove that for n > 0 the number (1.3) cannot be repre- 
sented as a product of binomial coefficients. 
(b) Show that Chebyshev’s numbers (1.3) are integral for any n > 0. 


Hint. (b) Use Exercise 1.3 and the inequality
\[ f(x) = (\lfloor 30x \rfloor + \lfloor x \rfloor) - (\lfloor 6x \rfloor + \lfloor 10x \rfloor + \lfloor 15x \rfloor) \ge 0 \tag{1.4} \]

valid for all real x. The function f(x) assumes only integral values because
of the way it is defined; you need to demonstrate that f(x) is either 0 or
1 for all real x. In order to establish this, make use of $y = \lfloor y \rfloor + \{y\}$
where {y} is the fractional part of y (which is a 1-periodic function), so
that $f(x) = (\{6x\} + \{10x\} + \{15x\}) - (\{30x\} + \{x\})$ is 1-periodic and the
required property of f(x) is to be shown for the range 0 ≤ x < 1 only.
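
The hint can be explored experimentally. The sketch below evaluates the step function f(x) = (⌊30x⌋ + ⌊x⌋) − (⌊6x⌋ + ⌊10x⌋ + ⌊15x⌋) on 0 ≤ x < 1 and computes the ratio n!(30n)!/((6n)!(10n)!(15n)!) exactly. It relies on the observation (an assumption of this sketch, easily justified) that f can only jump at multiples of 1/30, so sampling x = k/30 covers the whole interval; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial, floor


def f(x: Fraction) -> int:
    """The step function from the hint, evaluated exactly on a rational x."""
    return (floor(30 * x) + floor(x)) - (floor(6 * x) + floor(10 * x) + floor(15 * x))


def chebyshev_ratio(n: int) -> Fraction:
    """n! (30n)! / ((6n)! (10n)! (15n)!) as an exact fraction."""
    numerator = factorial(n) * factorial(30 * n)
    denominator = factorial(6 * n) * factorial(10 * n) * factorial(15 * n)
    return Fraction(numerator, denominator)


# f is constant on each interval [k/30, (k+1)/30), so these samples suffice.
values_on_unit_interval = {f(Fraction(k, 30)) for k in range(30)}
```

The set `values_on_unit_interval` collects every value f takes on [0, 1), and `chebyshev_ratio(n).denominator` equals 1 exactly when the ratio is an integer.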
Chebyshev’s example falls into the category of more general factorial
ratios

\[ D_n(a, b) = \frac{(a_1 n)! \cdots (a_r n)!}{(b_1 n)! \cdots (b_s n)!}, \tag{1.5} \]
where the integer vectors $a = (a_1, \dots, a_r)$ and $b = (b_1, \dots, b_s)$ satisfy the
balancing condition
\[ \sum_{i=1}^{r} a_i = \sum_{j=1}^{s} b_j \]
(which we will always assume in what follows) and the arithmetic condition
\[ \sum_{i=1}^{r} \lfloor a_i x \rfloor - \sum_{j=1}^{s} \lfloor b_j x \rfloor \ge 0 \quad \text{for } x \in \mathbb{R}. \tag{1.6} \]

By the arithmetic proof above or by the arithmetic argument for part (b)
of Exercise 1.5 we conclude that condition (1.6) is sufficient and necessary
for the integrality of $D_n(a, b)$. (Yes, indeed, it is necessary as well: if (1.6)
does not hold then, for many values of n, there are primes that show up in
the denominator of the corresponding $D_n(a, b)$ with larger exponent than
in the numerator, because the corresponding analogue of function (1.4)
assumes negative values for some 0 < x < 1.) This result [52] is known in
the literature as Landau’s criterion.
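
Landau's criterion is a finite check, which the following sketch makes explicit. It assumes positive integer entries and the balancing condition, under which the step function in (1.6) is 1-periodic and can only jump at points j/c with c an entry of a or b; it is therefore enough to test those points in [0, 1). The names `landau_holds` and `factorial_ratio` are hypothetical.

```python
from fractions import Fraction
from math import factorial, floor


def landau_holds(a: tuple[int, ...], b: tuple[int, ...]) -> bool:
    """Test condition (1.6) for balanced vectors a and b of positive integers."""
    if sum(a) != sum(b):
        raise ValueError("balancing condition sum(a) == sum(b) is violated")
    # Breakpoints of the right-continuous step function inside [0, 1).
    points = {Fraction(j, c) for c in a + b for j in range(c)}
    return all(
        sum(floor(ai * x) for ai in a) - sum(floor(bj * x) for bj in b) >= 0
        for x in points
    )


def factorial_ratio(a: tuple[int, ...], b: tuple[int, ...], n: int) -> Fraction:
    """D_n(a, b) from (1.5) as an exact fraction."""
    numerator = 1
    for ai in a:
        numerator *= factorial(ai * n)
    denominator = 1
    for bj in b:
        denominator *= factorial(bj * n)
    return Fraction(numerator, denominator)
```

For example, `landau_holds((30, 1), (6, 10, 15))` covers Chebyshev's case, while `landau_holds((1, 1), (2,))` fails at x = 1/2, in line with the non-integrality of (n!)²/(2n)!.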

The same argument, based on the inequality

\[ \lfloor 2x \rfloor + \lfloor 2y \rfloor - \lfloor x \rfloor - \lfloor x + y \rfloor - \lfloor y \rfloor \ge 0, \]

shows that the so-called super-Catalan numbers, or the Gessel numbers
(after Gessel [35] who introduced them in 1992 with a combinatorial motivation)

\[ G_{m,n} = \frac{(2m)! \, (2n)!}{m! \, (m+n)! \, n!} \]

are integers as well for all m, n ≥ 0. Notice that when m = 1 this is just
two times the ordinary Catalan number $C_n$. But as for binomial coefficients
we can also prove that $G_{m,n} \in \mathbb{Z}$ using other techniques, for example, the
identity
\[ G_{m,n} = \sum_{k=-\min\{m,n\}}^{\min\{m,n\}} (-1)^k \binom{2n}{n+k} \binom{2m}{m+k} \]
due to von Szily (1894). The latter may be challenging for you to verify
but it is algorithmically provable [61], meaning that there is no need for
performing tedious calculations, which are required in the arithmetic argument.
One can also prove the integrality of $G_{m,n}$ by induction using the
initial conditions $G_{0,n} = G_{n,0} = \binom{2n}{n}$ and the formula

\[ G_{m,m+\ell} = \sum_{k=0}^{\lfloor \ell/2 \rfloor} 2^{\ell - 2k} \binom{\ell}{2k} G_{m,k} \tag{1.7} \]
due to Gessel [35]. The formulae of von Szily and Gessel originate from
analysis and combinatorics; their analogues for general integer-valued factorial
ratios $D_n(a, b)$ are not known. This makes the arithmetic argument
quite exclusive.
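
The two formulae can be tested on small cases. In this sketch the identity of von Szily is taken in the form G(m, n) = Σ (−1)^k C(2n, n+k) C(2m, m+k) over |k| ≤ min{m, n}, and Gessel's formula in the form G(m, m+ℓ) = Σ 2^(ℓ−2k) C(ℓ, 2k) G(m, k) over 0 ≤ k ≤ ⌊ℓ/2⌋; both forms are stated here as assumptions of the check, and the function names are illustrative.

```python
from math import comb, factorial


def gessel(m: int, n: int) -> int:
    """Super-Catalan number (2m)! (2n)! / (m! (m+n)! n!)."""
    numerator = factorial(2 * m) * factorial(2 * n)
    denominator = factorial(m) * factorial(m + n) * factorial(n)
    quotient, remainder = divmod(numerator, denominator)
    if remainder:
        raise ArithmeticError("ratio is not an integer")
    return quotient


def von_szily(m: int, n: int) -> int:
    """Alternating sum of products of two binomial coefficients."""
    r = min(m, n)
    return sum(
        (-1 if k % 2 else 1) * comb(2 * n, n + k) * comb(2 * m, m + k)
        for k in range(-r, r + 1)
    )


def gessel_sum(m: int, ell: int) -> int:
    """Right-hand side of the recursion expressing G(m, m + ell)."""
    return sum(
        2 ** (ell - 2 * k) * comb(ell, 2 * k) * gessel(m, k)
        for k in range(ell // 2 + 1)
    )
```

For instance, `gessel(1, n)` is twice the n-th Catalan number, and `von_szily(m, n)` and `gessel_sum(m, l)` can be compared with `gessel(m, n)` and `gessel(m, m + l)` respectively.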


1.3 The world of q-numbers


The material of this section may be considered advanced by the reader
whose familiarity with polynomials, their reducibility and their zeros (often
called roots) in finite extensions of the field Q is still developing.
We touch these aspects in greater detail in Section 6.1; but already
at this point it is useful to single out a particular family of monic
polynomials, the cyclotomic polynomials
\[ \Phi_m(x) = \prod_{\substack{j=1 \\ (j,m)=1}}^{m} (x - e^{2\pi i j/m}). \tag{1.8} \]

The product in (1.8) is taken over the primitive roots of unity
$U_m = \{e^{2\pi i j/m} : j = 1, \dots, m, \; (j, m) = 1\}$ of degree m, that is, over those $\alpha \in \mathbb{C}$ for which
$\alpha^m = 1$ but $\alpha^n \ne 1$ when 0 < n < m. There are natural bijections (automorphisms)
of the set $U_m$ onto itself given by $\alpha \mapsto \alpha^a$, where integers
a satisfy (a, m) = 1; the inverse bijection is given by $\alpha \mapsto \alpha^b$ with b such
that $ab \equiv 1 \pmod m$. It is not hard to see that the set $G_m$ of these
automorphisms has the structure of a group, which is isomorphic to the (multiplicative)
group $(\mathbb{Z}/m\mathbb{Z})^* = \{a \pmod m : (a, m) = 1\}$. The latter group
will be featured in many further discussions below; in particular, its order
$\varphi(m) = |(\mathbb{Z}/m\mathbb{Z})^*|$, known as Euler’s totient function, will be computed.
The automorphisms from $G_m$ (and they only!) permute the roots of unity
$U_m$, hence leave the cyclotomic polynomial $\Phi_m(x)$ unchanged; they form
the Galois group of the polynomial. This explicit description immediately
implies that (1.8) is a polynomial with integral coefficients, irreducible over
Z and even over Q (so that no proper sub-product in (1.8) is in Q[x]), of
degree φ(m).
There is a natural context, in which these irreducible cyclotomic polynomials
are viewed as polynomial analogues of prime numbers, more accurately,
as deformed primes. For historical reasons, a traditional name for
the variable in this deformation is q rather than x, so that the construction
below is a q-deformation of natural numbers.

Define q-numbers as

\[ [n] = [n]_q = \frac{1 - q^n}{1 - q} = 1 + q + q^2 + \dots + q^{n-1} \]
for n = 1, 2, .... Clearly, $[n]_q \to n$ in the limit as q → 1. The q-numbers
are polynomials of degree n − 1. Unlike the polynomials $\Phi_n(q)$ they are
in general reducible over Q. But because $q^n - 1 = \prod_{j=1}^{n} (q - e^{2\pi i j/n})$ we
essentially know all irreducible factors of $[n]_q$: they are cyclotomic.


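Exercise 1.6. (a) Show that, for n = 2, 3, ...,
\[ [n]_q = \prod_{\substack{m \mid n \\ m > 1}} \Phi_m(q). \tag{1.9} \]

(b) Use the Möbius inversion formula to conclude from part (a) that for
the same range of n,
\[ \Phi_n(q) = \prod_{\substack{m \mid n \\ m > 1}} [m]_q^{\mu(n/m)}. \]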


Formula (1.9) tells us, in particular, that a q-number $[n]_q$ is an irreducible
polynomial if and only if n is prime, in which case $[n]_q$ coincides
with $\Phi_n(q)$. But it also suggests that the formula is a q-deformation of the
decomposition (1.1). This is a bit harder to believe, since the q → 1
limit in (1.9) results in
\[ n = \prod_{\substack{m \mid n \\ m > 1}} \Phi_m(1) \]
and this (more regular looking!) product does not really resemble (1.1).
Nevertheless the result truly duplicates the latter (if we agree to ignore the
numerous ones that appear as values $\Phi_m(1)$), because of the following evaluations.


Exercise 1.7. (a) Let p be prime. Verify that, for α ≥ 2, we have $\Phi_{p^\alpha}(q) = \Phi_{p^{\alpha-1}}(q^p)$.
In particular, $\Phi_{p^\alpha}(1) = p$ for any α ≥ 1.
(b) Show that, more generally,
(c) Using part (b) or otherwise prove that $\Phi_m(1) = 1$ if m > 1 is not of
the form $p^\alpha$.


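The next natural step in this story is defining q-factorials

\[ [n]! = [n]_q! = \prod_{k=1}^{n} [k]_q \]
whose irreducible polynomial factors are exclusively $\Phi_\ell(q)$ with ℓ = 2, 3,
4, ..., and the q-binomial coefficients

\[ \begin{bmatrix} n \\ m \end{bmatrix} = \begin{bmatrix} n \\ m \end{bmatrix}_q = \frac{[n]!}{[m]! \, [n-m]!}, \]
also known by the name of Gaussian polynomials. Using (1.9) and arguing as
in Exercise 1.3 we conclude that

\[ \operatorname{ord}_{\Phi_\ell(q)} [n]! = \left\lfloor \frac{n}{\ell} \right\rfloor \quad \text{for all } \ell = 2, 3, 4, \dots. \tag{1.10} \]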


In particular, this implies that $\begin{bmatrix} n \\ m \end{bmatrix} \in \mathbb{Z}[q]$. (You may also verify the q-Pascal
triangle relations
\[ \begin{bmatrix} n \\ m \end{bmatrix} = \begin{bmatrix} n-1 \\ m-1 \end{bmatrix} + q^m \begin{bmatrix} n-1 \\ m \end{bmatrix}, \]
which together with the boundary conditions $\begin{bmatrix} n \\ 0 \end{bmatrix} = \begin{bmatrix} n \\ n \end{bmatrix} = 1$ lead to the
same conclusion. There is also a q-deformation of the binomial theorem,
which we touch in Chapter 10.)

More generally, we conclude that the q-versions of the Chebyshev-Landau ratios

\[ D_n(a, b; q) = \frac{[a_1 n]! \cdots [a_r n]!}{[b_1 n]! \cdots [b_s n]!}, \]

subject to the conditions $\sum_{i=1}^{r} a_i = \sum_{j=1}^{s} b_j$ and (1.6), are all polynomials
in Z[q]. This latter property, which we may call q-integrality by analogy
with what we have seen in Section 1.2, does not require from us any special
effort; all we use are the same inequalities (1.6), while the formula for
$\operatorname{ord}_p n!$ is replaced with the (somewhat simpler) formula for $\operatorname{ord}_{\Phi_\ell(q)} [n]!$.
There is an interesting counterpart here for the polynomials $D_n(a, b; q)$
which is not seen by the numbers $D_n(a, b) = D_n(a, b; 1)$. We expect [79]
that all $D_n(a, b; q)$ are not just polynomials in Z[q] but that they also have
non-negative coefficients. Note that their irreducible cyclotomic
factors $\Phi_\ell(q)$, when ℓ is not of the form $p^\alpha$, have both positive and
negative coefficients; this is an easy consequence of Exercise 1.7. Therefore,
the expectation about non-negativity is not trivial; in fact, we do not know
how to prove it in the above generality, though some particular instances
can be shown by rather elementary means. For example, the non-negativity
of q-binomial coefficients follows from the q-Pascal triangle relations. Similarly,
a q-version of the identity (1.7) allows one to show the non-negativity
of the q-Gessel numbers

\[ G_{m,n}(q) = \frac{[2m]! \, [2n]!}{[m]! \, [m+n]! \, [n]!} \]

for all m, n ≥ 0.
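
The remark about q-binomial coefficients can be illustrated with a few lines of code. The sketch uses one standard form of the q-Pascal relation, [n, m] = [n−1, m−1] + q^m [n−1, m] (stated here as an assumption, since other equivalent forms exist), with polynomials stored as coefficient tuples, lowest degree first. The name `q_binomial` is illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def q_binomial(n: int, m: int) -> tuple[int, ...]:
    """Coefficients of the Gaussian polynomial [n, m]_q, lowest degree first."""
    if m < 0 or m > n:
        return (0,)
    if m == 0 or m == n:
        return (1,)  # boundary conditions
    first = q_binomial(n - 1, m - 1)
    second = (0,) * m + q_binomial(n - 1, m)  # multiplication by q^m
    size = max(len(first), len(second))
    first = first + (0,) * (size - len(first))
    second = second + (0,) * (size - len(second))
    return tuple(x + y for x, y in zip(first, second))
```

Since each step only adds polynomials with non-negative coefficients, non-negativity is visible from the recursion itself; for instance `q_binomial(4, 2)` encodes 1 + q + 2q² + q³ + q⁴, and the coefficients sum to the ordinary binomial coefficient.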

Exercise 1.8. In this exercise we look at very simple rational functions 

Cae) 

(Lge yar) 

(a) Show that $W_q(m, n; k)$ are polynomials in Z[q].

(b) Show that the coefficients of $W_q(m, n; k)$ are non-negative if k >
(m − 1)(n − 1). Can this happen for some k < (m − 1)(n − 1)?


W,(m,n; k) = with (m,n) = 1. 


Chapter notes 


Though we have not yet come across the Riemann hypothesis (RH), it is
worth mentioning that the Chebyshev-Landau factorial ratios (1.5) show
up naturally in its potential resolution, thanks to the equivalent Nyman-Beurling
formulation. In relation to this, Bober [13] lists all such integral
factorial ratios subject to the condition s ≤ r + 1 (which implies s = r + 1).
Furthermore, the very same ratios show up in the arithmetic study of so-called
mirror maps attached to Calabi-Yau manifolds and in the characterisation of
globally bounded hypergeometric series; none of these advanced topics is
discussed in this book and therefore we do not even define them properly
(but check equation (10.15) in Chapter 10 for a definition of generalized
hypergeometric series). However Bober’s analysis makes crucial use of a
simple criterion, due to F. Rodriguez Villegas (2005), saying that the integrality
of (1.5) when s = r + 1 is equivalent to the algebraicity of the corresponding
generating series

\[ \sum_{n=0}^{\infty} D_n(a, b) z^n \]

(which is a hypergeometric series), that is, the latter is a solution of a
polynomial equation with coefficients in Z[z]. Writing such a polynomial
down is not feasible for most of the examples because the degree is large;
it is equal to 483840 in Chebyshev’s example ($a_1 = 1$, $a_2 = 30$, $b_1 = 6$,
$b_2 = 10$, $b_3 = 15$). The expectation at the end of Section 1.3 raises naturally
the question whether there are methods to prove the integrality of the
Chebyshev-Landau factorial ratios without the arithmetic argument.

In Chapter 10 we return to the q-deformation as a main theme, so that
we discuss it in a more analytic context (while still aiming at applications
in number theory!). Here we would only highlight a q-deformation of a non-integral,
in fact transcendental, number (as we will see in Chapter 6),
namely the number π. Among many analytic expressions that define the
quantity (and we witness some more in Chapter 10) we single out the
classical representations

\[ \frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} \tag{1.11} \]
due to Leibniz and
\[ \pi = \left( \int_{-\infty}^{\infty} e^{-x^2} \, dx \right)^2, \]
the Gaussian probability density integral. If we now define

\[ \pi_q = 1 + 4 \sum_{n=0}^{\infty} \frac{(-1)^n q^{2n+1}}{1 - q^{2n+1}}, \quad |q| < 1, \]

then it is not hard to check the ‘agreement’ with Leibniz formula:

\[ \lim_{\substack{q \to 1 \\ |q| < 1}} (1 - q) \pi_q = \pi \]
(recall that $(1 - q)/(1 - q^{2n+1}) = 1/[2n+1]_q \to 1/(2n+1)$ and also $q^{2n+1} \to 1$
as q → 1). It is more difficult (as it requires some knowledge from the theory
of modular forms) to convince yourself that for $\pi_q$ so defined we have a
different representation
\[ \pi_q = \left( \sum_{n=-\infty}^{\infty} q^{n^2} \right)^2, \]
which is more in line with the Gaussian integral. Though such q-deformations
are not unique when based on a single formula, those that
fit several formulae are usually the ones to consider; this strategy
dictates $\pi_q$ to be essentially a canonical q-deformation of π.
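
The agreement of the two representations, and the limit towards π, can be observed in floating point. The sketch assumes the definitions π_q = 1 + 4 Σ (−1)^n q^(2n+1)/(1 − q^(2n+1)) and π_q = (Σ q^(n²))², the latter summed over all integers n; the truncation lengths are ad hoc choices suitable for the values of q used below.

```python
import math


def pi_q_series(q: float, terms: int = 2000) -> float:
    """Truncation of 1 + 4 * sum_{n>=0} (-1)^n q^(2n+1) / (1 - q^(2n+1))."""
    total = sum(
        (-1) ** n * q ** (2 * n + 1) / (1 - q ** (2 * n + 1)) for n in range(terms)
    )
    return 1 + 4 * total


def pi_q_theta(q: float, terms: int = 200) -> float:
    """Truncation of (sum over all integers n of q^(n^2)) squared."""
    theta = 1 + 2 * sum(q ** (n * n) for n in range(1, terms))
    return theta ** 2


# The two expressions should agree, and (1 - q) * pi_q should approach pi.
difference = abs(pi_q_series(0.9) - pi_q_theta(0.9))
near_limit = (1 - 0.999) * pi_q_theta(0.999)
```

The first series converges slowly as q approaches 1, which is why the limit is probed here through the squared sum.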


Exercise 1.9. Show that 


co ae 
m=1 


Note that computing the limit of $(1 - q)\pi_q$ as q → 1 is challenging for this
series representation. 
Chapter 2


Prime number theorem 


For the entire duration of this chapter we stick to the standard convention
of using π(x) for counting the prime numbers less than or equal to x:

\[ \pi(x) = \sum_{p \le x} 1; \]
here x > 0 is a real number. For example, π(1) = 0, π(10) = 4, $\pi(10^{12})$ =
37 607 912 018 and $\pi(p_n) = n$, where $p_n$ denotes the nth prime. No exact
formula for the function π(x) is known, though resolution of the famous
Riemann Hypothesis will give a reasonably simple way to legally compute
it. Here is some historical information about development of our knowledge
about the distribution of primes:


• Euclid: π(x) → ∞ as x → ∞;
• Euler: π(x)/x → 0 as x → ∞;
• Chebyshev (1848): if the limit
\[ \lim_{x \to \infty} \frac{\pi(x)}{x / \ln x} \]
exists then it is equal to 1;
• Hadamard and de la Vallée-Poussin (1896): the asymptotic distribution
of the prime numbers among the positive integers is given
by
\[ \pi(x) \sim \frac{x}{\ln x} \quad \text{as } x \to \infty. \]
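
For small x the function π(x) is easy to tabulate with the sieve of Eratosthenes, which gives a feeling for the statements above. This is a minimal sketch; the name `prime_pi_table` is an illustrative choice and the argument is assumed to be at least 2.

```python
import math


def prime_pi_table(limit: int) -> list[int]:
    """Return a list t with t[x] = number of primes <= x for 0 <= x <= limit."""
    if limit < 2:
        raise ValueError("limit must be at least 2")
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    table, count = [], 0
    for flag in is_prime:
        count += flag  # True counts as 1
        table.append(count)
    return table


table = prime_pi_table(10 ** 5)
ratio = table[10 ** 5] / (10 ** 5 / math.log(10 ** 5))
```

The quantity `ratio` compares π(x) with x/ln x at x = 10⁵; it is still visibly larger than 1 there, which illustrates how slowly the limit is approached.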


2.1 Chebyshev’s bounds for primes 


Lemma 2.1. Let n be a positive integer and K = lcm(1, 2, ..., 2n + 1) (the
least common multiple). Then $K > 4^n$.


Proof. Consider the integral

\[ I = \int_0^1 x^n (1 - x)^n \, dx. \]

Since $0 < x(1 - x) \le \frac{1}{4}$ on the interval 0 < x < 1, we have $0 < I < 1/4^n$.
On the other hand,

\[ x^n (1 - x)^n = a_n x^n + a_{n+1} x^{n+1} + \dots + a_{2n} x^{2n} \]

for some integers $a_n, a_{n+1}, \dots, a_{2n}$, so that the integration gives us
\[ I = \frac{a_n}{n+1} + \frac{a_{n+1}}{n+2} + \dots + \frac{a_{2n}}{2n+1}. \]
This implies that K × I is a positive integer; in particular, KI ≥ 1. The
latter estimate together with the bound $I < 1/4^n$ implies the claim.
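
Each step of this proof can be checked in exact arithmetic for small n. The sketch expands xⁿ(1 − x)ⁿ by the binomial theorem, integrates term by term, and compares with K = lcm(1, 2, ..., 2n + 1). It assumes Python 3.9 or later, where `math.lcm` accepts several arguments; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, lcm


def beta_integral(n: int) -> Fraction:
    """Exact value of the integral of x^n (1 - x)^n over [0, 1]."""
    # x^n (1 - x)^n = sum_k (-1)^k C(n, k) x^(n + k), integrated term by term
    return sum(Fraction((-1) ** k * comb(n, k), n + k + 1) for k in range(n + 1))


def lcm_upto(m: int) -> int:
    """Least common multiple of 1, 2, ..., m."""
    return lcm(*range(1, m + 1))


def check_lemma(n: int) -> bool:
    K, I = lcm_upto(2 * n + 1), beta_integral(n)
    product = K * I
    return (
        0 < I < Fraction(1, 4 ** n)
        and product.denominator == 1
        and product >= 1
        and K > 4 ** n
    )
```

Calling `check_lemma(n)` for a range of n confirms, in those cases, that K·I is a positive integer and that K exceeds 4ⁿ.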


Lemma 2.2. The product $\prod_{p \le x} p$ over primes is bounded from above by
$4^x$ for each x ≥ 2.


Proof. It is sufficient to prove the statement for integral x, as then
$\prod_{p \le x} p = \prod_{p \le \lfloor x \rfloor} p < 4^{\lfloor x \rfloor} \le 4^x$.

Use mathematical induction on integer x ≥ 2. The statement is
clearly true for x = 2, 3. Assume that it is true for all x < n, where
n ≥ 4, and show that it also holds for x = n. If n is even then
$\prod_{p \le n} p = \prod_{p \le n-1} p < 4^{n-1} < 4^n$. Therefore, let us concentrate on the case of odd n,
so that n = 2m − 1 for some m ≥ 3. Split our product into two parts,

\[ \prod_{p \le n} p = \prod_{p \le 2m-1} p = \prod_{p \le m} p \times \prod_{m < p \le 2m-1} p < 4^m \binom{2m-1}{m}, \]

since all the primes from the second product divide the factorial in the
numerator of the binomial coefficient

\[ \binom{2m-1}{m} = \frac{(2m-1)!}{m! \, (m-1)!} \]

but do not divide the factorials in the denominator, so that Theorem 1.1
implies $p \mid \binom{2m-1}{m}$ for all m < p ≤ 2m − 1. Using the binomial theorem in
the form
\[ 2^{2m-1} = (1 + 1)^{2m-1} = \sum_{k=0}^{2m-1} \binom{2m-1}{k} > \binom{2m-1}{m-1} + \binom{2m-1}{m} = 2 \binom{2m-1}{m} \]
we deduce $\binom{2m-1}{m} < 2^{2m-2} = 4^{m-1}$, thus $\prod_{p \le n} p < 4^m \cdot 4^{m-1} = 4^{2m-1} = 4^n$, which settles
the case of odd n as well.
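
The two ingredients of the proof, the bound on the product of primes and the divisibility of the binomial coefficient C(2m−1, m) by every prime p with m < p ≤ 2m − 1, can be verified for small arguments. The helper names below are illustrative.

```python
import math
from math import comb


def primes_upto(limit: int) -> list[int]:
    """All primes <= limit, by the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[:2] = [False] * min(2, limit + 1)
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p, flag in enumerate(sieve) if flag]


def primorial_bound_holds(limit: int) -> bool:
    """Check that the product of primes <= n is below 4^n for 2 <= n <= limit."""
    primes = set(primes_upto(limit))
    product = 1
    for n in range(2, limit + 1):
        if n in primes:
            product *= n
        if product >= 4 ** n:
            return False
    return True


def middle_primes_divide(m: int) -> bool:
    """Check that every prime p with m < p <= 2m - 1 divides C(2m - 1, m)."""
    binomial = comb(2 * m - 1, m)
    return all(binomial % p == 0 for p in primes_upto(2 * m - 1) if p > m)
```

Both checks are expected to succeed for every argument in range, as the lemma and its proof predict.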
Theorem 2.1. There exist absolute positive real constants a and b such
that for all x ≥ 2 we have
\[ a \frac{x}{\ln x} < \pi(x) < b \frac{x}{\ln x}. \]

Proof. We will prove the theorem with the constants $a = \frac{1}{2} \ln 2$ and b =
6 ln 2 for x ≥ 6.

Choose n such that 2n + 1 ≤ x < 2n + 3 and take K = lcm(1, 2, ...,
2n + 1) = $p_1^{\alpha_1} \cdots p_s^{\alpha_s}$, where s = π(2n + 1). First notice that $p_i^{\alpha_i} \le 2n + 1$
for all i, as each $p_i^{\alpha_i}$ must appear on the list 1, 2, ..., 2n + 1 of the first
natural numbers by the definition of the least common multiple. Therefore,
$K = p_1^{\alpha_1} \cdots p_s^{\alpha_s} \le (2n + 1)^s$. On the other hand, from the estimate derived
in Lemma 2.1 we find out that $(2n + 1)^s > 4^n$. Taking the logarithm gives

\[ \pi(x) \ge \pi(2n+1) = s > \frac{2n}{\log_2(2n+1)} \ge \frac{x - 3}{\log_2 x} \ge \frac{x/2}{\log_2 x} = a \frac{x}{\ln x}. \]

Now proceed with the estimate from above. We have

\[ \pi(x) = \sum_{p \le x} 1 = \sum_{p \le \sqrt{x}} 1 + \sum_{\sqrt{x} < p \le x} 1 \le \pi(\sqrt{x}) + \sum_{\sqrt{x} < p \le x} \frac{\log_2 p}{\log_2 \sqrt{x}} \le \sqrt{x} + \frac{2}{\log_2 x} \sum_{p \le x} \log_2 p \]
\[ = \sqrt{x} + \frac{2}{\log_2 x} \log_2 \prod_{p \le x} p < \sqrt{x} + \frac{4x}{\log_2 x} \le \frac{6x}{\log_2 x} = b \frac{x}{\ln x}, \]

where Lemma 2.2 and the inequality $\sqrt{x} \le 2x / \log_2 x$ for x ≥ 6 were
used.

Chebyshev in 1848 deduced the estimates with much better constants
a ≈ 0.92129 and b ≈ 1.10555. To achieve, for example, the better choice
of b he used the integers (1.3) from Exercise 1.5 in place of the binomial
coefficients $\binom{2m-1}{m}$ as we did in the proof of Lemma 2.2.


2.2  Riemann’s zeta function and its basic properties 


Riemann’s zeta function is a complex-valued function

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]
of argument $s = \sigma + it \in \mathbb{C}$, where $n^s = e^{s \ln n} = n^\sigma n^{it} = n^\sigma (\cos(t \ln n) + i \sin(t \ln n))$,
so that $|n^s| = n^\sigma$.
Lemma 2.3. The series defining the zeta function converges absolutely
in the half-plane Re s > 1 and defines there the analytic function ζ(s).
Furthermore,
\[ \zeta^{(k)}(s) = (-1)^k \sum_{n=1}^{\infty} \frac{\ln^k n}{n^s} \quad \text{for } k = 1, 2, \dots. \]

Proof. Let $\sigma_0 > 1$ be fixed. The functions $f_n(s) = n^{-s}$
are analytic in the half-plane D : Re s > $\sigma_0$ and the series $\sum_{n=1}^{\infty} f_n(s)$
converges absolutely and uniformly, because of the uniform estimates
\[ \sum_{n=1}^{\infty} |f_n(s)| < \sum_{n=1}^{\infty} \frac{1}{n^{\sigma_0}} \]
valid for all s ∈ D. By the Weierstrass theorem the sum of the series
$\zeta(s) = \sum_{n=1}^{\infty} f_n(s)$ is analytic in D and its derivatives $\zeta^{(k)}(s)$ are given by
$\sum_{n=1}^{\infty} f_n^{(k)}(s)$ in D for all k = 1, 2, ....

Because the choice of $\sigma_0 > 1$ is arbitrary, the analyticity of ζ(s) and
representation of $\zeta^{(k)}(s)$ remain valid in the domain Re s > 1.


The von Mangoldt function is defined for positive integers n by

\[ \Lambda(n) = \begin{cases} \ln p & \text{if } n = p^k, \\ 0 & \text{otherwise.} \end{cases} \]


Lemma 2.4. In the half-plane Re s > 1, the representation

\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} \]

is valid.


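Proof. As a warm-up we observe that, for $n = p_1^{a_1} \cdots p_m^{a_m}$, we have
\[ \sum_{d \mid n} \Lambda(d) = \sum_{i=1}^{m} \sum_{j=1}^{a_i} \Lambda(p_i^j) = \sum_{i=1}^{m} \sum_{j=1}^{a_i} \ln p_i = \sum_{i=1}^{m} a_i \ln p_i = \ln(p_1^{a_1} \cdots p_m^{a_m}) = \ln n \]
(see Exercise 1.1).
Again, let $\sigma_0 > 1$ be fixed. We have

\[ \zeta(s) \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = \sum_{k=1}^{\infty} \frac{1}{k^s} \sum_{d=1}^{\infty} \frac{\Lambda(d)}{d^s} \]
(both series converge uniformly in the domain Re s > $\sigma_0$)

\[ = \sum_{d=1}^{\infty} \sum_{k=1}^{\infty} \frac{\Lambda(d)}{(dk)^s} = \sum_{n=1}^{\infty} \frac{1}{n^s} \sum_{d \mid n} \Lambda(d) = \sum_{n=1}^{\infty} \frac{\ln n}{n^s} = -\zeta'(s). \]

The result now follows from noticing that $\sigma_0$ can be chosen arbitrarily close
to 1.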
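
The warm-up identity, that the sum of Λ(d) over the divisors d of n equals ln n, is easy to test in floating point. In the sketch below Λ(n) = ln p when n is a power p^k of a prime p with k ≥ 1, and 0 otherwise; the function names are illustrative.

```python
import math


def von_mangoldt(n: int) -> float:
    """ln p if n = p^k for a prime p and k >= 1, otherwise 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the least prime factor; n is a prime power iff nothing else remains
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime


def divisor_sum(n: int) -> float:
    """Sum of the von Mangoldt function over the divisors of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)
```

Comparing `divisor_sum(n)` with `math.log(n)` for a range of n should show agreement up to rounding errors.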


Theorem 2.2. ζ(s) ≠ 0 for complex s from the half-plane Re s > 1.


Proof. Assume this is false and ζ(s) vanishes at $s = s_0$, Re $s_0$ > 1. Then
the logarithmic derivative of ζ(s) has a pole of order 1 at this point:


\[ \frac{\zeta'(s)}{\zeta(s)} \sim \frac{c}{s - s_0} \quad \text{with some } c \ne 0 \]


in a neighbourhood of $s = s_0$; in particular, no limit exists as $s \to s_0$. On
the other hand, by Lemma 2.4,

\[ \lim_{s \to s_0} \left( -\frac{\zeta'(s)}{\zeta(s)} \right) = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{s_0}}, \]

the latter being an absolutely convergent series. This contradiction means
that no zero of ζ(s) exists in the half-plane Re s > 1.


In fact, Riemann’s zeta function also does not vanish in an open region 
that includes the entire line Res = 1. For our purposes though it will be 
sufficient to check that ζ(s) ≠ 0 on the line, without entering the critical
strip 0 < Res < 1. 


Exercise 2.1. In the half-plane Re s > 1, prove that
\[ \zeta(s)^2 = \sum_{n=1}^{\infty} \frac{\tau(n)}{n^s} \quad \text{and} \quad \frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}, \]

where the function τ(n) is defined in Exercise 1.1 and μ(n) is the Möbius
function.


2.3 Analytic continuation of ζ(s) to the domain Re s > 0


Lemma 2.5 (Abel transformation). Let $\{a_n\}_{n=1}^{\infty}$ be a sequence of complex
numbers and g(t) a complex-valued differentiable function of real variable
t ∈ [1, ∞). Then

\[ \sum_{1 \le k \le x} a_k g(k) = A(x) g(x) - \int_1^x A(t) g'(t) \, dt, \]

where
\[ A(t) = \sum_{1 \le k \le t} a_k. \]


Proof. Use mathematical induction on n, where n − 1 < x ≤ n (in
other words, n = ⌈x⌉). For n = 1 we get the obvious identity $a_1 g(1) = A(1) g(1)$.
Assume that the required equality is true for all x ≤ n; we
will then demonstrate its truth for x ∈ (n, n + 1]. Introduce the auxiliary
function

\[ B(x) = A(x) g(x) - \int_1^x A(t) g'(t) \, dt \]
so that we need to show that $B(x) = \sum_{k \le x} a_k g(k)$. We have

\[ B(x) - B(n) = A(x) g(x) - A(n) g(n) - \int_n^x A(t) g'(t) \, dt = (A(x) - A(n)) g(x) = \begin{cases} 0 & \text{if } x < n + 1, \\ a_{n+1} g(n+1) & \text{if } x = n + 1. \end{cases} \]
By the inductive hypothesis $B(n) = \sum_{k=1}^{n} a_k g(k)$, implying the desired
formula $B(x) = \sum_{k \le x} a_k g(k)$ for n < x ≤ n + 1.


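Consider the Abel transformation in a concrete situation. Take $a_n = 1$
and $g(t) = t^{-s}$, so that $A(x) = \sum_{k \le x} a_k = \lfloor x \rfloor$. Then
\[ \sum_{n=1}^{N} \frac{1}{n^s} = \sum_{n \le N} a_n g(n) = A(N) g(N) + s \int_1^N \frac{\lfloor t \rfloor \, dt}{t^{s+1}} \]
\[ = \frac{N}{N^s} + s \int_1^N \frac{t \, dt}{t^{s+1}} - s \int_1^N \frac{\{t\} \, dt}{t^{s+1}} = \frac{1}{N^{s-1}} + \frac{s \, t^{1-s}}{1 - s} \bigg|_{t=1}^{t=N} - s \int_1^N \frac{\{t\} \, dt}{t^{s+1}} \]
\[ = \frac{1}{N^{s-1}} + \frac{s}{1 - s} \left( \frac{1}{N^{s-1}} - 1 \right) - s \int_1^N \frac{\{t\} \, dt}{t^{s+1}} \]
\[ = \frac{1}{1 - s} \cdot \frac{1}{N^{s-1}} + \frac{s}{s - 1} - s \int_1^N \frac{\{t\} \, dt}{t^{s+1}}, \]

where {t} is the fractional part. Passing to the limit as N → ∞ we obtain

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \frac{s}{s - 1} - s \int_1^{\infty} \frac{\{t\} \, dt}{t^{s+1}}. \]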
To see the correctness of the limiting passage observe that, for each n ≥ 1,
the function
\[ h_n(s) = \int_n^{n+1} \frac{\{t\} \, dt}{t^{s+1}} = \int_n^{n+1} \frac{t - n}{t^{s+1}} \, dt \]
is analytic and the series

\[ \sum_{n=1}^{\infty} h_n(s) = \int_1^{\infty} \frac{\{t\} \, dt}{t^{s+1}} \]

converges absolutely and uniformly in the domain Re s > $\sigma_0$ for any $\sigma_0 > 0$,
because of the estimate
\[ \left| \frac{\{t\}}{t^{s+1}} \right| \le \frac{1}{t^{\sigma_0 + 1}} \]
implying
\[ |h_n(s)| \le \int_n^{n+1} \frac{dt}{t^{\sigma_0 + 1}} = \frac{1}{\sigma_0 n^{\sigma_0}} - \frac{1}{\sigma_0 (n+1)^{\sigma_0}}, \]
hence
\[ \sum_{n=1}^{N} |h_n(s)| \le \frac{1}{\sigma_0} - \frac{1}{\sigma_0 (N+1)^{\sigma_0}} < \frac{1}{\sigma_0} \]
for all integers N ≥ 1. We summarise our finding in the following statement.


Theorem 2.3 (analytic continuation of $\zeta(s)$). The meromorphic function

$$\frac{s}{s-1} - s\int_1^{\infty} \frac{\{t\}\,dt}{t^{s+1}} = 1 + \frac{1}{s-1} - s\int_1^{\infty} \frac{\{t\}\,dt}{t^{s+1}}$$

defines the analytic continuation of $\zeta(s)$ to the half-plane $\operatorname{Re} s > 0$, where
it has a single pole of order 1 at $s = 1$ with residue 1.
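The continuation formula lends itself to a quick numerical sanity check for real $s$. The following sketch is an illustration only: it truncates the series $\sum_n h_n(s)$ after finitely many terms, uses the elementary antiderivative of $(t-n)/t^{s+1}$ on each interval $[n, n+1]$, and the names `h` and `zeta_continued` are hypothetical.

```python
import math


def h(n, s):
    # Integral of (t - n) / t**(s + 1) over [n, n + 1], via the antiderivative
    # F(t) = t**(1 - s) / (1 - s) + n * t**(-s) / s   (real s, s != 0, 1).
    def antiderivative(t):
        return t ** (1 - s) / (1 - s) + n * t ** (-s) / s

    return antiderivative(n + 1) - antiderivative(n)


def zeta_continued(s, terms=100_000):
    # Truncation of s/(s-1) - s * sum_{n>=1} h_n(s); the tail that is dropped
    # is of size about terms**(-s), so convergence is slow for small s > 0.
    integral = sum(h(n, s) for n in range(1, terms + 1))
    return s / (s - 1) - s * integral
```

For $s = 2$ the truncated value can be compared with $\pi^2/6$, and for $s = 1/2$ (inside the strip $0 < \operatorname{Re} s < 1$, where the defining series diverges) with the commonly quoted value $\zeta(1/2) \approx -1.46035$; in the second case only two or three digits should be expected from this many terms.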


Remark. Another way to analytically continue Riemann’s zeta function to 
the strip 0 < Res < 1 is by means of the representation 


Using it one can show that $\zeta(s)$ does not vanish for real $s$ in the range
$0 < s < 1$.


2.4 Euler’s product and absence of zeros of $\zeta(s)$ on the line $\operatorname{Re} s = 1$


A multiplicative function is an arithmetic function f(n) of a positive integer 
n with the property that f(1) = 1 and, whenever a and b are coprime, then 
f(ab) = f(a) f(b). A function f(n) is said to be completely (or totally) mul- 
tiplicative if f(1) = 1 and f(ab) = f(a) f(b) holds for all positive integers a 
and b. 


Lemma 2.6. Let $f$ be a completely multiplicative function for which the
series $S = \sum_{n=1}^{\infty} f(n)$ absolutely converges. Then

$$S = \prod_p \bigl(1 - f(p)\bigr)^{-1},$$

where the product is over all primes.


Proof. First of all, check that $|f(n)| < 1$ for all $n \ge 2$. Indeed, if this is not
the case, $|f(n)| \ge 1$ for some $n \ge 2$, then $|f(n^k)| = |f(n)|^k \ge 1$, so that the
necessary condition $f(n^k) \to 0$ as $k \to \infty$ for convergence of the series $S$ is
violated. With the bound $|f(p)| < 1$ for any prime $p$ in mind, we can write
the geometric series

$$\bigl(1 - f(p)\bigr)^{-1} = \sum_{k=0}^{\infty} f(p)^k = \sum_{k=0}^{\infty} f(p^k),$$

which together with the complete multiplicativity of $f(n)$ and the fundamental
theorem of arithmetic (Theorem 1.1) imply

$$\prod_{p\le x} \bigl(1 - f(p)\bigr)^{-1} = \sum_{\substack{n = p_1^{k_1}\cdots p_m^{k_m} \\ p_i \le x}} f(n) = S - \sum_{n\,:\,p\mid n \text{ for some } p > x} f(n).$$

It follows from this result that

$$\Bigl| S - \prod_{p\le x} \bigl(1 - f(p)\bigr)^{-1} \Bigr| \le \sum_{n\,:\,p\mid n \text{ for some } p > x} |f(n)| \le \sum_{n > x} |f(n)| = \sum_{n > \lfloor x\rfloor} |f(n)|.$$

The latter sum is a tail of the absolutely convergent series $\sum_{n=1}^{\infty} |f(n)|$,
thus it tends to 0 as $x \to \infty$. This proves the required limiting relation.


Theorem 2.4 (Euler’s product for $\zeta(s)$). In the half-plane $\operatorname{Re} s > 1$, the
following representation takes place:

$$\zeta(s) = \prod_p \bigl(1 - p^{-s}\bigr)^{-1}.$$

Proof. Apply Lemma 2.6 to the completely multiplicative function $f(n) =
1/n^s$.
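The proof of Lemma 2.6 shows that the product over primes $p \le x$ differs from the full sum by at most the tail $\sum_{n > x} |f(n)|$. A minimal numerical sketch of this for $f(n) = n^{-2}$ follows; the helper names `primes_up_to` and `euler_product` are hypothetical, and the sieve is a standard sieve of Eratosthenes rather than anything taken from the text.

```python
import math


def primes_up_to(n):
    # Sieve of Eratosthenes returning all primes p <= n (n >= 1).
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... in a single slice assignment.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def euler_product(s, bound):
    # Partial Euler product: product of (1 - p**(-s))**(-1) over primes p <= bound.
    product = 1.0
    for p in primes_up_to(bound):
        product /= 1.0 - p ** (-s)
    return product
```

With `bound = 10**5` and `s = 2` the tail estimate predicts an error of roughly $10^{-5}$ when the partial product is compared with $\zeta(2) = \pi^2/6$.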


Exercise 2.2 (Euler). Use Theorem 2.4 to show that the series

$$\sum_p \frac{1}{p}$$

diverges. Conclude from this that there are infinitely many primes.

Now notice the following nice but elementary inequality.

Lemma 2.7. We have

$$\bigl|(1-r)^3(1-re^{i\theta})^4(1-re^{2i\theta})\bigr| \le 1, \quad\text{where } 0 < r < 1.$$


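Proof. Denote $M = |(1-r)^3(1-re^{i\theta})^4(1-re^{2i\theta})|$ and observe that
$\operatorname{Re}\ln(1-z) = \ln|1-z|$ for all $z$ inside the unit disc, $|z| < 1$. Therefore,

$$\begin{aligned}
\ln M &= 3\ln|1-r| + 4\ln|1-re^{i\theta}| + \ln|1-re^{2i\theta}| \\
&= \operatorname{Re}\bigl(3\ln(1-r) + 4\ln(1-re^{i\theta}) + \ln(1-re^{2i\theta})\bigr) \\
&= -\operatorname{Re}\left(3\sum_{n=1}^{\infty}\frac{r^n}{n} + 4\sum_{n=1}^{\infty}\frac{r^ne^{in\theta}}{n} + \sum_{n=1}^{\infty}\frac{r^ne^{2in\theta}}{n}\right) \\
&= -\sum_{n=1}^{\infty}\frac{r^n}{n}\,(3 + 4\cos n\theta + \cos 2n\theta) \\
&= -\sum_{n=1}^{\infty}\frac{r^n}{n}\times 2(1+\cos n\theta)^2 \le 0,
\end{aligned}$$

implying that $M \le 1$.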
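The key step of the proof is the trigonometric identity $3 + 4\cos\varphi + \cos 2\varphi = 2(1+\cos\varphi)^2$. As an informal check, the sketch below evaluates both the identity and the quantity $M$ on a grid; the names `M_value` and `worst_case` and the grid size are illustrative assumptions, and a grid scan is of course no substitute for the proof.

```python
import cmath
import math


def M_value(r, theta):
    # |(1 - r)^3 (1 - r e^{i theta})^4 (1 - r e^{2 i theta})|
    z1 = 1 - r
    z2 = 1 - r * cmath.exp(1j * theta)
    z3 = 1 - r * cmath.exp(2j * theta)
    return abs(z1 ** 3 * z2 ** 4 * z3)


def worst_case(steps=200):
    # Largest value of M over a grid of 0 < r < 1 and 0 <= theta < 2*pi.
    return max(
        M_value(i / steps, 2 * math.pi * j / steps)
        for i in range(1, steps)
        for j in range(steps)
    )


def identity_gap(phi):
    # Difference between the two sides of 3 + 4 cos(phi) + cos(2 phi) = 2 (1 + cos(phi))^2.
    return 3 + 4 * math.cos(phi) + math.cos(2 * phi) - 2 * (1 + math.cos(phi)) ** 2
```

On such a grid the maximum stays below 1 and approaches it only as $r \to 0$, in agreement with the lemma.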


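Theorem 2.5. If $\operatorname{Re} s = 1$ then $\zeta(s) \ne 0$.


Proof. It follows from the lemma and Euler’s representation of Riemann’s
zeta function (Theorem 2.4) that

$$\bigl|\zeta^3(\sigma)\zeta^4(\sigma+it)\zeta(\sigma+2it)\bigr| = \prod_p \bigl|(1-p^{-\sigma})^3(1-p^{-\sigma-it})^4(1-p^{-\sigma-2it})\bigr|^{-1} \ge 1$$

for $\sigma > 1$. Furthermore, Theorem 2.3 implies that

$$\zeta(\sigma) = \frac{\sigma}{\sigma-1} - \sigma\int_1^{\infty}\frac{\{t\}\,dt}{t^{\sigma+1}} < \frac{\sigma}{\sigma-1},$$

so that $\zeta(\sigma) = O((\sigma-1)^{-1})$ as $\sigma \to 1^+$. Assume that $s_0 = 1 + it_0$ for some
$t_0 \ne 0$ is a zero of $\zeta(s)$ on the line $\operatorname{Re} s = 1$. Then $\zeta(\sigma + it_0) = O(s - s_0) =
O(\sigma - 1)$ and $\zeta(\sigma + 2it_0) = O(1)$ as $\sigma \to 1^+$. Then

$$\bigl|\zeta^3(\sigma)\zeta^4(\sigma+it_0)\zeta(\sigma+2it_0)\bigr| = O\bigl((\sigma-1)^{-3}\cdot(\sigma-1)^4\cdot 1\bigr) = O(\sigma-1) \quad\text{as } \sigma \to 1^+,$$

so that $|\zeta^3(\sigma)\zeta^4(\sigma+it_0)\zeta(\sigma+2it_0)|$ can be made arbitrarily close to 0,
contradicting the earlier established bound for the expression.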


2.5 Upper estimates for $\zeta'(s)/\zeta(s)$


In what follows $s = \sigma + it$. The goal of this section is to give upper estimates
for the absolute value of the logarithmic derivative of $\zeta(s)$ in the domain
$1 \le \sigma \le 2$, $|t| \ge 3$.


Lemma 2.8. In the domain $1 \le \sigma \le 2$, $|t| \ge 3$, the following estimates
take place:

$$|\zeta(s)| < 5\ln|t| \quad\text{and}\quad |\zeta'(s)| < 8\ln^2|t|.$$


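Proof. In the domain under consideration, the function $\zeta(s)$ is analytic and
computed by the formula

$$\zeta(s) = \sum_{n=1}^{N}\frac{1}{n^s} + \frac{1}{(s-1)N^{s-1}} - s\int_N^{\infty}\frac{\{t\}\,dt}{t^{s+1}}$$

(see Section 2.3). Differentiating both sides of the representation in the
domain we also get

$$\zeta'(s) = -\sum_{n=1}^{N}\frac{\ln n}{n^s} - \frac{1}{(s-1)^2N^{s-1}} - \frac{\ln N}{(s-1)N^{s-1}} - \int_N^{\infty}\frac{\{t\}\,dt}{t^{s+1}} + s\int_N^{\infty}\frac{\{t\}\ln t\,dt}{t^{s+1}}.$$

In these formulas we choose $N = \lfloor |t| \rfloor \ge 3$. Then

$$\Bigl|\sum_{n=1}^{N}\frac{1}{n^s}\Bigr| \le \sum_{n=1}^{N}\frac{1}{n} < 1 + \int_1^N\frac{dx}{x} = 1 + \ln N \le 2\ln|t|,$$

$$\Bigl|\sum_{n=1}^{N}\frac{\ln n}{n^s}\Bigr| \le \sum_{n=1}^{N}\frac{\ln n}{n} \le \frac{\ln 2}{2} + \frac{\ln 3}{3} + \int_3^N\frac{\ln x}{x}\,dx = \frac{\ln 2}{2} + \frac{\ln 3}{3} + \frac{\ln^2 N}{2} - \frac{\ln^2 3}{2} < \ln^2 N \le \ln^2|t|,$$

$$\Bigl|\int_N^{\infty}\frac{\{t\}\,dt}{t^{s+1}}\Bigr| \le \int_N^{\infty}\frac{dt}{t^{\sigma+1}} \le \int_N^{\infty}\frac{dt}{t^2} = \frac{1}{N} \le 1,$$

$$\Bigl|s\int_N^{\infty}\frac{\{t\}\,dt}{t^{s+1}}\Bigr| \le \frac{|s|}{N} \le \frac{\sigma + |t|}{N} < \frac{2 + (N+1)}{N} = 1 + \frac{3}{N} \le 2,$$

$$\Bigl|\frac{1}{(s-1)N^{s-1}}\Bigr| = \frac{1}{|s-1|\,N^{\sigma-1}} \le \frac{1}{|s-1|} \le 1, \qquad \Bigl|\frac{1}{(s-1)^2N^{s-1}}\Bigr| \le \frac{1}{|s-1|^2} \le 1,$$

$$\Bigl|\frac{\ln N}{(s-1)N^{s-1}}\Bigr| \le \frac{\ln N}{|s-1|} \le \ln|t|$$

and, finally,

$$\Bigl|s\int_N^{\infty}\frac{\{t\}\ln t\,dt}{t^{s+1}}\Bigr| \le |s|\int_N^{\infty}\frac{\ln t}{t^2}\,dt = |s|\cdot\Bigl(-\frac{1+\ln t}{t}\Bigr)\Big|_{t=N}^{\infty} = \frac{|s|}{N}(1+\ln N) \le 2(1+\ln|t|).$$

Thus,

$$|\zeta(s)| < 2\ln|t| + 1 + 2 \le 5\ln|t|,$$
$$|\zeta'(s)| < \ln^2|t| + 1 + \ln|t| + 1 + 2(1+\ln|t|) \le 8\ln^2|t|.$$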


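Lemma 2.9. In the domain $1 \le \sigma \le 2$, $|t| \ge 3$, the following estimate
holds:

$$\Bigl|\frac{\zeta'(s)}{\zeta(s)}\Bigr| < C\ln^9|t|, \quad\text{where } C = 2^{23}.$$

Proof. By Lemma 2.7 and Theorem 2.4 we have

$$\bigl|\zeta^3(\sigma)\zeta^4(\sigma+it)\zeta(\sigma+2it)\bigr| \ge 1$$

(see also the proof of Theorem 2.5). In particular, we conclude that

$$|\zeta(s)| = |\zeta(\sigma+it)| \ge |\zeta(\sigma)|^{-3/4}\,|\zeta(\sigma+2it)|^{-1/4}.$$

It follows from Theorem 2.3 that

$$\zeta(\sigma) = \frac{\sigma}{\sigma-1} - \sigma\int_1^{\infty}\frac{\{t\}\,dt}{t^{\sigma+1}} < \frac{\sigma}{\sigma-1}$$

for all $\sigma > 1$, so that

$$\zeta(\sigma) < \frac{2}{\sigma-1} \le 2C\ln^9|t|$$

for all $\sigma \ge \sigma_1 = 1 + 1/(C\ln^9|t|)$ in the range $\sigma \le 2$. Lemma 2.8 implies the estimate

$$|\zeta(\sigma+2it)| < 5\ln(2|t|) < 16\ln|t|,$$

so that

$$|\zeta(s)| > (2C\ln^9|t|)^{-3/4}(16\ln|t|)^{-1/4} = 16C^{-1}\ln^{-7}|t|$$

in the domain $\sigma_1 \le \sigma \le 2$, $|t| \ge 3$. But then for $\sigma$ in the range $1 \le \sigma \le \sigma_1$
we have from the mean value theorem and the estimate of Lemma 2.8 for
$\zeta'(s)$,

$$\begin{aligned}
|\zeta(s)| &\ge |\zeta(\sigma_1+it)| - |\zeta(\sigma_1+it) - \zeta(\sigma+it)| \\
&= |\zeta(\sigma_1+it)| - \Bigl|\int_{\sigma}^{\sigma_1}\zeta'(u+it)\,du\Bigr| \\
&\ge 16C^{-1}\ln^{-7}|t| - (\sigma_1-\sigma)\times 8\ln^2|t| \\
&\ge 16C^{-1}\ln^{-7}|t| - 8C^{-1}\ln^{-7}|t| \\
&= 8C^{-1}\ln^{-7}|t|.
\end{aligned}$$

This means that the estimate

$$|\zeta(s)| \ge 8C^{-1}\ln^{-7}|t|$$

holds for all $s$ from the domain $1 \le \sigma \le 2$, $|t| \ge 3$. Then Lemma 2.8 is used
again to conclude with the estimate

$$\Bigl|\frac{\zeta'(s)}{\zeta(s)}\Bigr| < \frac{8\ln^2|t|}{8C^{-1}\ln^{-7}|t|} = C\ln^9|t|.$$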


2.6 Chebyshev’s function $\psi(x)$.
Reduction of the prime number theorem


Recall the prime counting function

$$\pi(x) = \sum_{p\le x} 1$$

and introduce the related Chebyshev function

$$\psi(x) = \sum_{n\le x}\Lambda(n) = \sum_{p^m\le x}\ln p,$$

where the summation is over all pairs of primes $p$ and exponents $m$ subject
to $p^m \le x$. The latter inequality implies that $m \le (\ln x)/(\ln p)$, so that

$$\psi(x) = \sum_{p\le x}\Bigl\lfloor\frac{\ln x}{\ln p}\Bigr\rfloor\ln p.$$

In particular, we have

$$\pi(x)\ln x - \psi(x) = \sum_{p\le x}\Bigl(\frac{\ln x}{\ln p} - \Bigl\lfloor\frac{\ln x}{\ln p}\Bigr\rfloor\Bigr)\ln p = \sum_{p\le x}\Bigl\{\frac{\ln x}{\ln p}\Bigr\}\ln p \ge 0$$

for all $x > 0$.
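Lemma 2.10. The following asymptotic takes place: $\psi(x) = \pi(x)\ln x +
o(x)$ as $x \to \infty$.


Proof. Clearly,

$$\Bigl\{\frac{\ln x}{\ln p}\Bigr\}\ln p \le \ln p$$

for all primes $p$, and we also have

$$\Bigl\{\frac{\ln x}{\ln p}\Bigr\}\ln p = \ln x - \Bigl\lfloor\frac{\ln x}{\ln p}\Bigr\rfloor\ln p \le \ln x - \ln p = \ln\frac{x}{p}$$

when $p \le x$. Using now the upper bound $\pi(y) < by/(\ln y)$ for $y \ge 2$ from
Theorem 2.1 we find, for $x \ge 8 > e^2$, that

$$\begin{aligned}
\pi(x)\ln x - \psi(x) &= \sum_{p\le x/\ln x}\Bigl\{\frac{\ln x}{\ln p}\Bigr\}\ln p + \sum_{x/\ln x < p\le x}\Bigl\{\frac{\ln x}{\ln p}\Bigr\}\ln p \\
&\le \sum_{p\le x/\ln x}\ln p + \sum_{x/\ln x < p\le x}\ln\frac{x}{p} \\
&\le \sum_{p\le x/\ln x}\ln\frac{x}{\ln x} + \sum_{x/\ln x < p\le x}\ln(\ln x) \\
&\le \ln\frac{x}{\ln x}\cdot\pi\Bigl(\frac{x}{\ln x}\Bigr) + \ln\ln x\cdot\pi(x) \\
&< \frac{bx}{\ln x} + \frac{bx\ln\ln x}{\ln x} = \frac{bx(1+\ln\ln x)}{\ln x}.
\end{aligned}$$

Thus,

$$0 \le \frac{\pi(x)\ln x - \psi(x)}{x} < \frac{b(1+\ln\ln x)}{\ln x} \to 0 \quad\text{as } x \to \infty$$

and the required asymptotics follows.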
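To get a feeling for the quantities in Lemma 2.10, one can tabulate $\pi(x)$ and $\psi(x)$ directly. The sketch below is an illustration only; the function names are hypothetical, the sieve is a standard one, and the prime powers $p^m \le x$ are enumerated with integer arithmetic so that no rounding of $\ln x/\ln p$ is involved.

```python
import math


def primes_up_to(n):
    # Sieve of Eratosthenes returning all primes p <= n (n >= 1).
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def prime_count_and_psi(x):
    # Returns (pi(x), psi(x)) for an integer x >= 2, where
    # psi(x) is the sum of ln p over all prime powers p^m <= x.
    primes = primes_up_to(x)
    psi = 0.0
    for p in primes:
        power = p
        while power <= x:
            psi += math.log(p)
            power *= p
    return len(primes), psi
```

For moderate $x$ the difference $\pi(x)\ln x - \psi(x)$ is non-negative but still a noticeable fraction of $x$; the lemma only asserts that this fraction tends to 0, and the bound $b(1+\ln\ln x)/\ln x$ shows that it does so slowly.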
Lemma 2.10 means that the asymptotic distribution of primes,

$$\pi(x) \sim \frac{x}{\ln x} \quad\text{as } x \to \infty,$$

is equivalent to $\psi(x) = x + o(x)$ as $x \to \infty$. The next statement reduces
establishing of the latter to verifying the asymptotic relation $\omega(x) = x + o(x)$
as $x \to \infty$, where

$$\omega(x) = \int_1^x \frac{\psi(t)}{t}\,dt.$$


Lemma 2.11 (further reduction). If $\omega(x) = x + o(x)$ as $x \to \infty$, then also
$\psi(x) = x + o(x)$ and so $\pi(x) \sim x/(\ln x)$ as $x \to \infty$.
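Proof. Suppose that the asymptotics $\omega(x) = x + o(x)$ as $x \to \infty$ is established.
Take an arbitrary $\varepsilon$ in the range $0 < \varepsilon < 1$. Because the function
$\psi(t)$ is monotone increasing, we get

$$\omega((1+\varepsilon)x) - \omega(x) = \int_x^{(1+\varepsilon)x}\frac{\psi(t)}{t}\,dt \ge \psi(x)\int_x^{(1+\varepsilon)x}\frac{dt}{t} = \psi(x)\ln(1+\varepsilon),$$

hence

$$\limsup_{x\to\infty}\frac{\psi(x)}{x} \le \lim_{x\to\infty}\frac{\omega((1+\varepsilon)x) - \omega(x)}{x\ln(1+\varepsilon)} = \frac{\varepsilon}{\ln(1+\varepsilon)}.$$

Since the estimate is true for any $\varepsilon > 0$, it remains valid as $\varepsilon \to 0^+$ and we
obtain

$$\limsup_{x\to\infty}\frac{\psi(x)}{x} \le \lim_{\varepsilon\to 0^+}\frac{\varepsilon}{\ln(1+\varepsilon)} = 1.$$

Similar consideration leads to

$$\omega(x) - \omega((1-\varepsilon)x) = \int_{(1-\varepsilon)x}^{x}\frac{\psi(t)}{t}\,dt \le \psi(x)\int_{(1-\varepsilon)x}^{x}\frac{dt}{t} = \psi(x)\ln\frac{1}{1-\varepsilon},$$

therefore

$$\liminf_{x\to\infty}\frac{\psi(x)}{x} \ge \frac{1}{\ln(1/(1-\varepsilon))}\lim_{x\to\infty}\frac{\omega(x) - \omega((1-\varepsilon)x)}{x} = \frac{\varepsilon}{-\ln(1-\varepsilon)}$$

and

$$\liminf_{x\to\infty}\frac{\psi(x)}{x} \ge \lim_{\varepsilon\to 0^+}\frac{\varepsilon}{-\ln(1-\varepsilon)} = 1.$$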


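2.7 Integral representation for $\omega(x)$


Lemma 2.12. For $a, b > 0$, we have

$$\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{b^s}{s^2}\,ds = \begin{cases} \ln b & \text{if } b \ge 1, \\ 0 & \text{if } 0 < b < 1, \end{cases}$$

where the integration is performed along the vertical line $\operatorname{Re} s = a$.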


Proof. For $s = a + it$,

$$\Bigl|\frac{b^s}{s^2}\Bigr| = \frac{b^a}{a^2+t^2};$$

therefore, the integral under consideration converges absolutely. Consider
first the case $b \ge 1$ and the integral

$$I(r) = \frac{1}{2\pi i}\oint_{\Gamma}\frac{b^s}{s^2}\,ds$$

along the closed contour $\Gamma$ (on the left of Fig. 2.1) consisting of the arc of
radius $r$ centred at 0 from the left of the line $\operatorname{Re} s = a$ and the segment
joining the endpoints of the arc on the line.

Fig. 2.1 The contour for $b \ge 1$ (left) and $0 < b < 1$ (right)

Inside the contour, the integrand has a single pole of order 2 at $s = 0$
and, because of the Laurent expansion

$$\frac{b^s}{s^2} = \frac{e^{s\ln b}}{s^2} = \frac{1}{s^2} + \frac{\ln b}{s} + \frac{\ln^2 b}{2} + \cdots,$$

we have $I(r) = \ln b$, so that

$$\frac{1}{2\pi i}\int_{a-ir_0}^{a+ir_0}\frac{b^s}{s^2}\,ds = \ln b - \frac{1}{2\pi i}\int_{\text{arc}}\frac{b^s}{s^2}\,ds,$$

where $r_0 = \sqrt{r^2 - a^2}$. Furthermore, $|b^s| = b^{\operatorname{Re} s} \le b^a$ on the contour, since
$b \ge 1$ and $\operatorname{Re} s \le a$, and for the integral along the arc alone,

$$\Bigl|\frac{1}{2\pi i}\int_{\text{arc}}\frac{b^s}{s^2}\,ds\Bigr| \le \frac{1}{2\pi}\cdot\frac{b^a}{r^2}\int_{\text{arc}}|ds| \le \frac{b^a}{r} \to 0 \quad\text{as } r \to \infty,$$

implying that

$$\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{b^s}{s^2}\,ds = \lim_{r\to\infty}\frac{1}{2\pi i}\int_{a-ir_0}^{a+ir_0}\frac{b^s}{s^2}\,ds = \ln b.$$

In the case $0 < b < 1$, we use the closed contour $\Gamma$ depicted on the right
of Fig. 2.1. The corresponding integral $I(r)$ now vanishes, because there
are no poles inside $\Gamma$. In this case $|b^s| = b^{\operatorname{Re} s} \le b^a$ on $\Gamma$, so that the same
estimate for the integral along the arc is valid, and we conclude that the
integral along the line $\operatorname{Re} s = a$ is equal to 0.
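The discontinuous behaviour in Lemma 2.12 can be observed numerically. The sketch below is an illustration under stated assumptions: the vertical line is truncated to $|t| \le T$, the integral is approximated by the trapezoidal rule, and the function name `line_integral` together with the default values of `T` and `step` are hypothetical choices.

```python
import cmath
import math


def line_integral(b, a=1.0, T=400.0, step=0.02):
    # Trapezoidal approximation of (1 / (2 pi i)) * integral of b**s / s**2 ds
    # along s = a + i t, -T <= t <= T.
    n = int(round(2 * T / step))
    log_b = math.log(b)
    total = 0.0 + 0.0j
    for k in range(n + 1):
        t = -T + k * step
        s = complex(a, t)
        value = cmath.exp(s * log_b) / (s * s)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * value
    # ds = i dt, so the prefactor 1 / (2 pi i) turns into 1 / (2 pi).
    return (total * step / (2 * math.pi)).real
```

Calling `line_integral(3.0)` should give a value close to $\ln 3$, while `line_integral(1.0 / 3.0)` should be close to 0; the part of the line that is cut off contributes at most about $b^a/(\pi T)$.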
Theorem 2.6. For $a > 1$ and $x > 2$, the following integral representation
is valid:

$$\omega(x) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2}\,ds,$$

and the integral absolutely converges.


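Proof. It follows from Lemma 2.4 that on the line $\operatorname{Re} s = a$,

$$\Bigl|\frac{\zeta'(s)}{\zeta(s)}\Bigr| = \Bigl|\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^s}\Bigr| \le \sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^a} \le \sum_{n=1}^{\infty}\frac{\ln n}{n^a}.$$

This implies the absolute convergence of the integral. In the Abel transformation,
Lemma 2.5, take $a_k = \Lambda(k)$ and $g(t) = \ln(x/t)$. Then
$A(t) = \sum_{k\le t}\Lambda(k) = \psi(t)$, Chebyshev’s function, hence

$$\sum_{n\le x}\Lambda(n)\ln\frac{x}{n} = \psi(x)\ln 1 + \int_1^x \psi(t)\,\frac{dt}{t} = \omega(x).$$

It follows from Lemma 2.12 that

$$\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\Bigl(\frac{x}{n}\Bigr)^s\frac{ds}{s^2} = \begin{cases} \ln(x/n) & \text{if } n \le x, \\ 0 & \text{if } n > x, \end{cases}$$

while Lemma 2.4 implies

$$\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2} = \sum_{n=2}^{\infty}\Lambda(n)\Bigl(\frac{x}{n}\Bigr)^s\frac{1}{s^2}.$$

Combining the two results we obtain

$$\begin{aligned}
\omega(x) = \sum_{n\le x}\Lambda(n)\ln\frac{x}{n}
&= \sum_{n\le x}\Lambda(n)\,\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\Bigl(\frac{x}{n}\Bigr)^s\frac{ds}{s^2} \\
&= \sum_{n=2}^{\infty}\Lambda(n)\,\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\Bigl(\frac{x}{n}\Bigr)^s\frac{ds}{s^2} \\
&= \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\sum_{n=2}^{\infty}\Lambda(n)\Bigl(\frac{x}{n}\Bigr)^s\frac{ds}{s^2} \\
&= \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2}\,ds,
\end{aligned}$$

the desired identity, so we only need to justify the interchange of summation
and integration. For the latter, notice that

$$\Bigl|\Lambda(n)\Bigl(\frac{x}{n}\Bigr)^s\frac{1}{s^2}\Bigr| \le \frac{\ln n}{a^2}\Bigl(\frac{x}{n}\Bigr)^a$$

on the contour of integration $\operatorname{Re} s = a$, hence the series

$$\sum_{n=2}^{\infty}\Lambda(n)\Bigl(\frac{x}{n}\Bigr)^s\frac{1}{s^2}$$

converges absolutely and uniformly on the line. This means that for each
$T > 0$, we have the equality

$$\frac{1}{2\pi i}\int_{a-iT}^{a+iT}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2}\,ds = \sum_{n=2}^{\infty}\Lambda(n)\,\frac{1}{2\pi i}\int_{a-iT}^{a+iT}\Bigl(\frac{x}{n}\Bigr)^s\frac{ds}{s^2},$$

while the estimate

$$\Bigl|\Lambda(n)\,\frac{1}{2\pi i}\int_{a-iT}^{a+iT}\Bigl(\frac{x}{n}\Bigr)^s\frac{ds}{s^2}\Bigr| \le \frac{\ln n}{2\pi}\Bigl(\frac{x}{n}\Bigr)^a\int_{-\infty}^{\infty}\frac{dt}{a^2+t^2} = \frac{\ln n}{2a}\Bigl(\frac{x}{n}\Bigr)^a$$

implies that the series over $n$ converges uniformly on the set $T > 0$, so the
transition as $T \to \infty$ is legal. This completes the proof of the theorem.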
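The first step of the proof, the identity $\sum_{n\le x}\Lambda(n)\ln(x/n) = \int_1^x \psi(t)\,dt/t$, can be verified numerically for integer $x$, because $\psi(t)$ is constant on each interval $[k, k+1)$. The following sketch is an illustration with hypothetical function names; it assumes the usual definition of the von Mangoldt function, $\Lambda(n) = \ln p$ if $n$ is a power of a prime $p$ and $\Lambda(n) = 0$ otherwise.

```python
import math


def von_mangoldt_table(n):
    # lam[m] = ln p if m is a power of a prime p, and 0.0 otherwise (0 <= m <= n).
    lam = [0.0] * (n + 1)
    is_composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if not is_composite[p]:
            for multiple in range(p * p, n + 1, p):
                is_composite[multiple] = 1
            power = p
            while power <= n:
                lam[power] = math.log(p)
                power *= p
    return lam


def omega_two_ways(x):
    # Returns (sum of Lambda(n) ln(x/n) over n <= x, integral of psi(t)/t over [1, x])
    # for an integer x >= 2.
    lam = von_mangoldt_table(x)
    direct = sum(lam[n] * math.log(x / n) for n in range(2, x + 1))
    psi, integral = 0.0, 0.0
    for k in range(1, x):
        psi += lam[k]  # psi(t) = psi(k) for k <= t < k + 1
        integral += psi * math.log((k + 1) / k)
    return direct, integral
```

Both returned numbers agree up to rounding, and their ratio to $x$ is already close to 1 for moderate $x$, which is the behaviour that the remaining sections establish rigorously.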


2.8 The principal asymptotics of $\omega(x)$


Theorem 2.7. Suppose that $\zeta(s)$ does not vanish for $s = \sigma + it$ inside the
closed rectangle $\eta \le \sigma \le 1$, $|t| \le T$ for some $\eta < 1$ and $T > 0$. Then
$\omega(x) = (1 + R(x))x$, where

$$R(x) = \frac{1}{2\pi i}\int_{\Gamma(T,\eta)}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^{s-1}}{s^2}\,ds$$

for the contour $\Gamma(T,\eta)$ depicted on the left of Fig. 2.2.


Proof. For $u > T$, consider the contour $\Gamma = \Gamma(u,T,\eta)$ on the right of
Fig. 2.2, which is symmetric about the real axis and in which the imaginary
parts of points $B$, $C$ are equal to $u$.

Fig. 2.2 The contour in Theorem 2.7 (left) and in its proof (right)

The function $\zeta(s)$ does not vanish inside
the contour and has a single simple pole with residue 1 at $s = 1$, so that
$\zeta(s) = 1/(s-1) + f(s)$ for some $f(s)$ analytic inside and on the boundary
of $\Gamma$. Then

$$-\frac{\zeta'(s)}{\zeta(s)} = \frac{1 - (s-1)^2f'(s)}{1 + (s-1)f(s)}\cdot\frac{1}{s-1},$$

hence the function

$$\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2}$$

has a single singularity inside $\Gamma$, the simple pole at $s = 1$ with residue
$x^s/s^2\,|_{s=1} = x$, implying

$$\frac{1}{2\pi i}\int_{\Gamma(u,T,\eta)}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2}\,ds = x.$$

Now estimate the integral along the segment $BC$ (and similarly, along $HA$):

$$\Bigl|\frac{1}{2\pi i}\int_{1+iu}^{2+iu}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2}\,ds\Bigr| \le C\,\frac{\ln^9 u}{u^2}\,x^2 \to 0 \quad\text{as } u \to \infty.$$

Thus, taking the limit as $u \to \infty$ in the former expression and using

$$\omega(x) = \frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^s}{s^2}\,ds$$

(here the convergence is absolute), we arrive at the desired claim.

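Lemma 2.13. For the function $R(x)$ defined in Theorem 2.7 we have
$R(x) \to 0$ as $x \to \infty$.


Proof. Notice that we are free to choose $\eta < 1$ and $T > 0$; the
only constraint is that $\zeta(s)$ should not vanish inside and on the sides of
the rectangle $\eta \le \operatorname{Re} s \le 1$, $|\operatorname{Im} s| \le T$.

Take an arbitrary $\varepsilon > 0$. It follows from Lemma 2.9 that

$$\Bigl|\Bigl(-\frac{\zeta'(1+it)}{\zeta(1+it)}\Bigr)\frac{1}{(1+it)^2}\Bigr| \le \frac{C\ln^9|t|}{1+t^2}$$

for $|t| \ge 3$. This means that we can pick some $T = T(\varepsilon) \ge 3$ independent of $x$, for
which

$$\Bigl|\frac{1}{2\pi i}\int_{1+iT}^{1+i\infty}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^{s-1}}{s^2}\,ds\Bigr| \le \frac{1}{2\pi}\int_T^{\infty}\frac{C\ln^9 t}{1+t^2}\,dt < \frac{\varepsilon}{5}$$

and

$$\Bigl|\frac{1}{2\pi i}\int_{1-i\infty}^{1-iT}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^{s-1}}{s^2}\,ds\Bigr| \le \frac{1}{2\pi}\int_T^{\infty}\frac{C\ln^9 t}{1+t^2}\,dt < \frac{\varepsilon}{5}.$$

Since the function $\zeta(s)$ does not vanish on the interval $[1-iT, 1+iT]$ for $T$
so chosen, we can choose some $\eta < 1$ in such a way that there are no zeros
of $\zeta(s)$ inside the rectangle $\eta \le \operatorname{Re} s \le 1$, $|\operatorname{Im} s| \le T$. The function

$$\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{1}{s^2}$$

is continuous on the sides $[1-iT, \eta-iT]$, $[\eta-iT, \eta+iT]$ and $[\eta+iT, 1+iT]$
of the rectangle, hence

$$\Bigl|\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{1}{s^2}\Bigr| \le M$$

on those sides for some $M = M(T,\eta) > 0$. This implies that

$$\Bigl|\frac{1}{2\pi i}\int_{\eta+iT}^{1+iT}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^{s-1}}{s^2}\,ds\Bigr| \le \frac{M}{2\pi}\int_{\eta}^{1}x^{\sigma-1}\,d\sigma = \frac{M}{2\pi x}\int_{\eta}^{1}e^{\sigma\ln x}\,d\sigma = \frac{M}{2\pi x\ln x}\,e^{\sigma\ln x}\Big|_{\sigma=\eta}^{1} < \frac{M}{2\pi x\ln x}\cdot x = \frac{M}{2\pi\ln x}$$

and, similarly,

$$\Bigl|\frac{1}{2\pi i}\int_{1-iT}^{\eta-iT}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^{s-1}}{s^2}\,ds\Bigr| < \frac{M}{2\pi\ln x},$$

while

$$\Bigl|\frac{1}{2\pi i}\int_{\eta-iT}^{\eta+iT}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^{s-1}}{s^2}\,ds\Bigr| \le \frac{Mx^{\eta-1}}{2\pi}\int_{-T}^{T}dt = \frac{MT}{\pi x^{1-\eta}}.$$

By choosing $x$ sufficiently large, $x \ge X = X(T,\eta)$, we can make all these
latter three integrals less than $\varepsilon/5$. Combining the five estimates for the
integrals involved in computation of $R(x)$, we see that there is a choice of
$T > 0$, $\eta < 1$ and $X > 0$ such that

$$|R(x)| = \Bigl|\frac{1}{2\pi i}\int_{\Gamma(T,\eta)}\Bigl(-\frac{\zeta'(s)}{\zeta(s)}\Bigr)\frac{x^{s-1}}{s^2}\,ds\Bigr| < \varepsilon$$

for all $x > X$. This concludes the proof of our lemma.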

Theorem 2.8 (prime number theorem). For the prime counting function
$\pi(x)$, we have

$$\pi(x) \sim \frac{x}{\ln x} \quad\text{as } x \to \infty.$$

Proof. It follows from Theorem 2.7 and Lemma 2.13 that $\omega(x) = x + o(x)$
as $x \to \infty$. By Lemma 2.11 this implies the asymptotics of $\pi(x)$.


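Exercise 2.3. Let $p_1 = 2 < p_2 = 3 < \cdots < p_n < \cdots$ be the sequence of all
primes. Show that

$$p_n \sim n\ln n \quad\text{as } n \to \infty.$$


Exercise 2.4. For $\operatorname{Re} s > 1$, show that

$$\Gamma(s)\zeta(s) = \int_0^{+\infty}\frac{x^{s-1}}{e^x - 1}\,dx.$$


Exercise 2.5. Show that if $a > 1$ and $x > 1$ is not an integer then

$$\sum_{1\le n\le x}\mu(n) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{1}{\zeta(s)}\,\frac{x^s}{s}\,ds,$$

where $\mu(n)$ is the Möbius function.


Hint. Use Exercise 2.1.


Exercise 2.6. Use the previous exercise to deduce that

$$\sum_{1\le n\le x}\mu(n) = o(x) \quad\text{as } x \to \infty.$$
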
Chapter notes 


Based on Euler’s ideas (1737), Riemann’s memoir Ueber die Anzahl der
Primzahlen unter einer gegebenen Grösse (1859) set up an analytic approach
to questions concerning the distribution of prime numbers, in particular
connecting those with the zeros of $\zeta(s)$ as a function of complex
variable $s$. Realisations of Riemann’s ideas for the asymptotic law of the
distribution of primes were found independently and in the same year (1896)
by J. Hadamard and Ch. J. de la Vallée Poussin (in Chapter 8 we witness
another longstanding problem with independent and simultaneous resolution;
such coincidences are common in number theory). Decades later
different proofs of the prime number theorem were found including the ‘elementary’
proofs (without use of complex analysis, at the cost of being more
manipulative) of A. Selberg and P. Erdős (both published in 1949, with
independence of Erdős’s proof disputed).

It is generally recognised that problems about prime numbers represent
very old and most challenging problems not only in number theory but in
mathematics in general; for this reason Analytic Number Theory is often
understood as a field dedicated exclusively to those. The spectacular recent
progress is already hard to survey, so we limit ourselves to mentioning the
Green–Tao theorem about arbitrarily long arithmetic progressions of prime
numbers [37] and the infinitude of bounded gaps between primes proven by
Y. Zhang [86] and later by J. Maynard [57] using a different method.

We return to the ‘prime’ topic in Chapter 5 to discuss the infinitude of 
prime numbers in arithmetic progressions, the result proven by Dirichlet 
long before Riemann’s memoir. 
Chapter 3


Riemann’s zeta function 
and its multiple generalisation 


3.1 Euler’s gamma function 


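Riemann’s zeta function is always accompanied by Euler’s gamma function
$\Gamma(z)$ defined through the product expansion

$$\frac{1}{\Gamma(z)} = ze^{\gamma z}\prod_{k=1}^{\infty}\Bigl(1 + \frac{z}{k}\Bigr)e^{-z/k} \qquad (3.1)$$

for its reciprocal. Here

$$\gamma = \lim_{n\to\infty}\Bigl(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n\Bigr) = 0.57721566490153286060651209008240243104215933593992\ldots$$

is the Euler (or Euler–Mascheroni) constant. A theorem of Weierstrass
guarantees that $1/\Gamma(z)$ is an entire function with zeros at $z = 0, -1, -2, \ldots$,
and many properties of the gamma function, like the difference equation

$$\Gamma(z+1) = z\Gamma(z), \qquad (3.2)$$

the reflection formula

$$\Gamma(z)\Gamma(1-z) = \frac{1}{z}\prod_{k=1}^{\infty}\Bigl(1 - \frac{z^2}{k^2}\Bigr)^{-1} = \frac{\pi}{\sin\pi z} \qquad (3.3)$$

and multiplication formula

$$\Gamma(z)\,\Gamma\Bigl(z + \frac{1}{n}\Bigr)\Gamma\Bigl(z + \frac{2}{n}\Bigr)\cdots\Gamma\Bigl(z + \frac{n-1}{n}\Bigr) = (2\pi)^{(n-1)/2}\,n^{1/2-nz}\,\Gamma(nz), \qquad (3.4)$$

follow straight from the defining product.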
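As an informal check of the product (3.1), one can truncate it and compare with a library implementation of the gamma function for real $z > 0$. In the sketch below the name `reciprocal_gamma`, the hard-coded value of $\gamma$ and the number of factors are illustrative assumptions; the logarithm of the product is accumulated to avoid overflow.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, rounded to double precision


def reciprocal_gamma(z, terms=200_000):
    # Truncation of (3.1): z * e^{gamma z} * prod_{k <= terms} (1 + z/k) e^{-z/k},
    # for real z > 0, computed through its logarithm.
    log_value = math.log(z) + EULER_GAMMA * z
    for k in range(1, terms + 1):
        log_value += math.log1p(z / k) - z / k
    return math.exp(log_value)
```

Each factor contributes about $-z^2/(2k^2)$ to the logarithm, so truncating after $N$ factors leaves a relative error of roughly $z^2/(2N)$; the product converges, but slowly.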
Exercise 3.1. Prove equations (3.2)—(3.4). 
We also take for granted from a complex analysis course the evaluation

$$\int_0^{\infty}e^{-t}t^{z-1}\,dt = \Gamma(z) \qquad (3.5)$$

of the Eulerian integral (of the second kind) in the domain $\operatorname{Re} z > 0$.


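Proposition 3.1. The logarithmic derivative $\psi(z) = \Gamma'(z)/\Gamma(z)$ of the
gamma function serves as a generating function for the values of Riemann’s
zeta function at positive integers. More specifically,

$$\psi(1-z) = -\gamma - \sum_{m=1}^{\infty}\zeta(m+1)z^m \quad\text{for } |z| < 1.$$


Proof. It follows from the logarithmic differentiation of (3.1) that

$$\psi(z) = -\gamma - \frac{1}{z} - \sum_{k=1}^{\infty}\Bigl(\frac{1}{z+k} - \frac{1}{k}\Bigr)$$

for $z \ne 0, -1, -2, \ldots$. Furthermore, from (3.2) we have $\psi(1+z) = 1/z +
\psi(z)$. Thus,

$$\begin{aligned}
\psi(1-z) = -\frac{1}{z} + \psi(-z) &= -\gamma + \sum_{k=1}^{\infty}\frac{1}{k}\Bigl(1 - \frac{1}{1 - z/k}\Bigr) \\
&= -\gamma - \sum_{k=1}^{\infty}\frac{1}{k}\sum_{m=1}^{\infty}\Bigl(\frac{z}{k}\Bigr)^m = -\gamma - \sum_{m=1}^{\infty}\zeta(m+1)z^m,
\end{aligned}$$

with all the internal series converging in the disk $|z| < 1$.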
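A rough numerical comparison of the two sides of Proposition 3.1 is possible with the standard library alone. The sketch below makes several simplifying assumptions that are not part of the text: the logarithmic derivative is approximated by a central difference of `math.lgamma`, the zeta values are computed from a partial sum with a simple Euler–Maclaurin tail correction, and all function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, rounded to double precision


def zeta_real(s, cutoff=1000):
    # Partial sum of n^{-s} plus the leading Euler-Maclaurin tail terms;
    # adequate for integer s >= 2.
    partial = sum(n ** (-s) for n in range(1, cutoff + 1))
    return partial + cutoff ** (1 - s) / (s - 1) - 0.5 * cutoff ** (-s)


def digamma_numeric(x, h=1e-5):
    # Central-difference approximation to Gamma'(x) / Gamma(x) for real x > h.
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)


def series_side(z, terms=60):
    # Truncation of -gamma - sum_{m >= 1} zeta(m + 1) z^m, intended for |z| <= 1/2.
    return -EULER_GAMMA - sum(zeta_real(m + 1) * z ** m for m in range(1, terms + 1))
```

For example, `digamma_numeric(1 - 0.5)` and `series_side(0.5)` should both be close to $-\gamma - 2\ln 2$, the classical value of the logarithmic derivative at $1/2$.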


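Exercise 3.2. In this exercise we compute the Eulerian integral of the first
kind

$$B(\alpha,\beta) = \int_0^1 x^{\alpha-1}(1-x)^{\beta-1}\,dx,$$

where $\operatorname{Re}\alpha > 0$ and $\operatorname{Re}\beta > 0$.
(a) Verify the following properties:

$$B(\alpha,\beta) = B(\beta,\alpha); \qquad B(\alpha,\beta+1) = \frac{\beta}{\alpha}\,B(\alpha+1,\beta);$$

$$B(\alpha,\beta) = B(\alpha+1,\beta) + B(\alpha,\beta+1); \qquad B(\alpha,\beta+1) = \frac{\beta}{\alpha+\beta}\,B(\alpha,\beta).$$

(b) Show that

$$\Gamma(\alpha)\Gamma(\beta) = 4\lim_{R\to\infty}\iint_{[0,R]^2}f(x,y)\,dx\,dy = 4\lim_{R\to\infty}\iint_{S_R}f(x,y)\,dx\,dy,$$

where $f(x,y) = e^{-(x^2+y^2)}x^{2\alpha-1}y^{2\beta-1}$ and $S_R$ is the circular sector $x^2+y^2 \le
R^2$, $x \ge 0$, $y \ge 0$.
(c) Pass to the polar coordinates $x = r\cos\theta$, $y = r\sin\theta$ in the integral

$$\iint_{S_R}f(x,y)\,dx\,dy$$

and use part (b) to conclude that

$$B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$


Hint. (b) Write

$$\Gamma(\alpha) = \int_0^{\infty}e^{-t}t^{\alpha-1}\,dt = 2\int_0^{\infty}e^{-x^2}x^{2\alpha-1}\,dx = 2\lim_{R\to\infty}\int_0^{R}e^{-x^2}x^{2\alpha-1}\,dx$$

and, similarly, for $\Gamma(\beta)$; then show that

$$\Bigl|\iint_{[0,R]^2}f(x,y)\,dx\,dy - \iint_{S_R}f(x,y)\,dx\,dy\Bigr| \to 0 \quad\text{as } R \to \infty.$$


Exercise 3.3. (a) Show the integral expansion

$$\psi(z) = -\gamma + \int_0^1\frac{1 - t^{z-1}}{1-t}\,dt$$

in the half-plane $\operatorname{Re} z > 0$.
(b) Prove that, for $n = 1, 2, 3, \ldots$,

$$\psi(n) = -\gamma + \sum_{k=1}^{n-1}\frac{1}{k}.$$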

3.2 Hurwitz’s zeta function


In order to analyse the properties of Riemann’s zeta function we turn our
attention to its slightly more general version

$$\zeta(s,a) = \sum_{n=0}^{\infty}\frac{1}{(n+a)^s} \qquad (3.6)$$

known as Hurwitz’s zeta function. In this expression we treat $a$ as a real
constant from the interval $0 < a \le 1$ (though one can allow $a$ to vary
over the real line, and even over the complex plane); again, the series in
(3.6) defines an analytic function of $s$ in the region $\operatorname{Re} s > 1$. Observe that

$$\zeta(s,1) = \zeta(s).$$
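Proposition 3.2. For $\operatorname{Re} s > 1$,

$$\zeta(s,a) = \frac{1}{\Gamma(s)}\int_0^{\infty}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx.$$


Proof. We start with the following consequence of (3.5):

$$(a+n)^{-s}\,\Gamma(s) = \int_0^{\infty}x^{s-1}e^{-(n+a)x}\,dx.$$

Taking $\delta > 0$, we have in the domain $\sigma = \operatorname{Re} s \ge 1+\delta$,

$$\begin{aligned}
\Gamma(s)\zeta(s,a) &= \lim_{N\to\infty}\sum_{n=0}^{N}\int_0^{\infty}x^{s-1}e^{-(n+a)x}\,dx \\
&= \lim_{N\to\infty}\Bigl(\int_0^{\infty}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx - \int_0^{\infty}\frac{x^{s-1}e^{-(N+1+a)x}}{1-e^{-x}}\,dx\Bigr) \\
&= \lim_{N\to\infty}\Bigl(\int_0^{\infty}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx - \int_0^{\infty}\frac{x^{s-1}e^{-(N+a)x}}{e^{x}-1}\,dx\Bigr).
\end{aligned}$$

Since $e^x \ge 1+x$ for $x \ge 0$, the absolute value of the second integral is
estimated from above by the quantity

$$\int_0^{\infty}x^{\sigma-2}e^{-(N+a)x}\,dx = (a+N)^{1-\sigma}\,\Gamma(\sigma-1),$$

which clearly tends to 0 as $N \to \infty$ in view of $\sigma - 1 \ge \delta > 0$. This gives
the desired formula for $\operatorname{Re} s \ge 1+\delta$, hence for $\operatorname{Re} s > 1$.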


For real $\rho > 0$ (possibly, $\rho = \infty$), introduce a (Hankel-type) contour
$D = D(\rho)$, which starts at $z = \rho$, passes once around the origin in the
positive direction (without crossing the half-line $z > 0$) and ends up at
$z = \rho$. Our principal interest is in the integral

$$\int_{D(\rho)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz$$

for a fixed $s$ from the half-plane $\sigma = \operatorname{Re} s \ge 1+\delta$. To avoid the unwanted
poles of the integrand, we further assume that the contours $D(\rho)$
do not contain the points $\pm 2\pi i n$ for $n = 1, 2, \ldots$. We specify the branch
of $(-z)^{s-1} = e^{(s-1)\log(-z)}$ by choosing the $\log(-z)$ to be real for negative
$z$; then $-\pi \le \arg(-z) \le \pi$ on the contours, and this makes the integrand a
single-valued function on $D(\rho)$. Of course, the integrand is not analytic
inside $D(\rho)$ but we can still deform it within $\mathbb{C}\setminus[0,\infty)$ to the contour going
along the upper bank of the cut $[0,\infty)$ from $\rho$ to $\varepsilon > 0$, then making a
circle of radius $\varepsilon$ around the origin and finally returning from $\varepsilon$ to $\rho$ along
the lower bank of the cut. At the beginning we have $\arg(-z) = -\pi$, so
that $(-z)^{s-1} = e^{-\pi i(s-1)}z^{s-1}$, and at the end we get $\arg(-z) = \pi$, hence
$(-z)^{s-1} = e^{\pi i(s-1)}z^{s-1}$. We set $-z = \varepsilon e^{i\theta}$ on the circle. Therefore,

$$\begin{aligned}
\int_{D(\rho)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz
&= e^{-\pi i(s-1)}\int_{\rho}^{\varepsilon}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx
 - i\int_{-\pi}^{\pi}\frac{\varepsilon^{s}e^{is\theta + a\varepsilon(\cos\theta + i\sin\theta)}}{1 - e^{\varepsilon(\cos\theta + i\sin\theta)}}\,d\theta
 + e^{\pi i(s-1)}\int_{\varepsilon}^{\rho}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx \\
&= -2i\sin\pi s\int_{\varepsilon}^{\rho}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx
 - i\varepsilon^{s-1}\int_{-\pi}^{\pi}\frac{\varepsilon e^{is\theta + a\varepsilon(\cos\theta + i\sin\theta)}}{1 - e^{\varepsilon(\cos\theta + i\sin\theta)}}\,d\theta
\end{aligned}$$

for $0 < \varepsilon < \rho$. As $\varepsilon \to 0$ we have $\varepsilon^{s-1} \to 0$ and

$$\int_{-\pi}^{\pi}\frac{\varepsilon e^{is\theta + a\varepsilon(\cos\theta + i\sin\theta)}}{1 - e^{\varepsilon(\cos\theta + i\sin\theta)}}\,d\theta \to \int_{-\pi}^{\pi}\frac{e^{is\theta}}{-(\cos\theta + i\sin\theta)}\,d\theta = -\int_{-\pi}^{\pi}e^{i(s-1)\theta}\,d\theta,$$

since the integrand uniformly converges to its limit. We conclude that

$$\int_{D(\rho)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz = -2i\sin\pi s\int_0^{\rho}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx,$$

implying

$$\int_{D(\infty)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz = -2i\sin\pi s\int_0^{\infty}\frac{x^{s-1}e^{-ax}}{1-e^{-x}}\,dx = -2i\sin\pi s\;\Gamma(s)\zeta(s,a) = -2\pi i\,\frac{\zeta(s,a)}{\Gamma(1-s)}$$

on the basis of Proposition 3.2 and reflection formula (3.3). This brings us
to the following result.


Proposition 3.3. For $\operatorname{Re} s > 1$,

$$\zeta(s,a) = -\frac{\Gamma(1-s)}{2\pi i}\int_{D(\infty)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz. \qquad (3.7)$$


The resulting integral is a single-valued analytic function of $s$ for all
$s \in \mathbb{C}$. Therefore, the only potential singularities of $\zeta(s,a)$ originate from
the singularities of $\Gamma(1-s)$, which are the points $s = 1, 2, \ldots$, since the
integral provides the analytic continuation of $\zeta(s,a)$ to the entire complex
plane with the exception of these points. At the same time, we already
know the analyticity of $\zeta(s,a)$ in the domain $\operatorname{Re} s > 1$ from its defining
series expansion (3.6). This leads us to the following.
Proposition 3.4. The function $\zeta(s,a)$ is analytic in $\mathbb{C}$ besides $s = 1$, where
it has a simple pole with residue 1.


When $a = 1$, this implies the analytic properties of $\zeta(s)$.


Proof. By the argument above, the point $s = 1$ is the only candidate for a
singular point. Taking $s = 1$ in the integral (without the gamma prefactor)
we get the expression

$$\frac{1}{2\pi i}\int_{D(\infty)}\frac{e^{-az}}{1-e^{-z}}\,dz,$$

which is equal to the residue of the integrand at $z = 0$: this is clearly equal
to 1. Combined with (3.7) this implies

$$\lim_{s\to 1}\frac{\zeta(s,a)}{\Gamma(1-s)} = -1.$$

It remains to recall that $\Gamma(1-s)$ has a simple pole at $s = 1$ with residue $-1$.


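Exercise 3.4. Show for $\operatorname{Re} s > 0$,

$$(1 - 2^{1-s})\zeta(s) = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^s} = \frac{1}{\Gamma(s)}\int_0^{\infty}\frac{x^{s-1}}{e^x+1}\,dx.$$


Exercise 3.5. Show for $\operatorname{Re} s > 1$,

$$(2^s - 1)\zeta(s) = \zeta\bigl(s,\tfrac12\bigr) = \frac{2^s}{\Gamma(s)}\int_0^{\infty}\frac{x^{s-1}e^{x}}{e^{2x}-1}\,dx.$$
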
Exercise 3.6. Show for all s £1, 


z 2'-sT(1 — s) (—z)s-t 
(Ol 53 (Qi-s 1) lees ae a 


where the contour $D(\infty)$ does not contain inside the points
$\pm\pi i, \pm 3\pi i, \pm 5\pi i, \ldots$.


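Proposition 3.5 (Hurwitz). For $0 < a \le 1$ and $\sigma = \operatorname{Re} s < 0$,

$$\zeta(s,a) = \frac{2\Gamma(1-s)}{(2\pi)^{1-s}}\Bigl(\sin\frac{\pi s}{2}\sum_{n=1}^{\infty}\frac{\cos 2\pi an}{n^{1-s}} + \cos\frac{\pi s}{2}\sum_{n=1}^{\infty}\frac{\sin 2\pi an}{n^{1-s}}\Bigr). \qquad (3.8)$$


Proof. Consider the integral

$$\frac{1}{2\pi i}\int_{C_N}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz,$$

where $N$ is an odd positive integer, the contour $C_N$ is the circle centred
at the origin of radius $N\pi$ going counter-clockwise from $N\pi$ to $N\pi$. We
assume that $\arg(-z) = 0$ at $z = -N\pi$.

In the domain bounded by the contours $C_N$ and $D(N\pi)$, the function
$(-z)^{s-1}e^{-az}/(1-e^{-z})$ is analytic and single-valued, except for the poles
at $\pm 2\pi i, \pm 4\pi i, \ldots, \pm(N-1)\pi i$. Therefore,

$$\frac{1}{2\pi i}\int_{C_N}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz - \frac{1}{2\pi i}\int_{D(N\pi)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz = \sum_{n=1}^{(N-1)/2}(R_n^+ + R_n^-),$$

where $R_n^+$ and $R_n^-$ are the residues of the integrand at $2n\pi i$ and
$-2n\pi i$, respectively. When $-z = 2n\pi e^{-\pi i/2}$, the residue is equal to
$(2n\pi)^{s-1}e^{-\pi i(s-1)/2}e^{-2an\pi i}$, so that

$$R_n^+ + R_n^- = 2(2n\pi)^{s-1}\sin\Bigl(\frac{\pi s}{2} + 2\pi an\Bigr) \quad\text{for } n = 1, 2, \ldots, \frac{N-1}{2}.$$

We obtain

$$-\frac{1}{2\pi i}\int_{D(N\pi)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz = \frac{2\sin\frac{\pi s}{2}}{(2\pi)^{1-s}}\sum_{n=1}^{(N-1)/2}\frac{\cos 2\pi an}{n^{1-s}} + \frac{2\cos\frac{\pi s}{2}}{(2\pi)^{1-s}}\sum_{n=1}^{(N-1)/2}\frac{\sin 2\pi an}{n^{1-s}} - \frac{1}{2\pi i}\int_{C_N}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz.$$

Furthermore, for $0 < a \le 1$ we can find an absolute bound $|e^{-az}/(1 -
e^{-z})| < M$ for $z \in C_N$, independent of $N$. This means that, for $\sigma =
\operatorname{Re} s < 0$,

$$\Bigl|\frac{1}{2\pi i}\int_{C_N}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz\Bigr| \le \frac{M}{2\pi}\int_{-\pi}^{\pi}\bigl|(N\pi)^{s}e^{is\theta}\bigr|\,d\theta < M(N\pi)^{\sigma}e^{\pi|s|} \to 0 \quad\text{as } N \to \infty.$$

Thus, letting $N \to \infty$ in the above equality we arrive at the desired formula
(3.8). Note the (absolute) convergence of both series when $\operatorname{Re} s < 0$.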


Theorem 3.1 (Riemann). The following functional equation is valid for
Riemann’s zeta function:

$$2^{1-s}\,\Gamma(s)\,\zeta(s)\cos\frac{\pi s}{2} = \pi^s\zeta(1-s). \qquad (3.9)$$

Proof. Take $a = 1$ in equation (3.8) and apply the reflection formula (3.3)
of the gamma function. This proves (3.9) in the domain $\operatorname{Re} s < 0$. Since
both sides are analytic in the larger domain $\mathbb{C}\setminus\{0,1\}$ (besides the simple
poles at $s = 0, 1$), the result remains valid there by the theory of analytic
continuation.


Exercise 3.7. Show that the function $\Gamma(s/2)\pi^{-s/2}\zeta(s)$ does not change under
the involution $s \mapsto 1-s$.


It follows from (3.9) that $\zeta(s)$ has zeros at negative even integers; these
are called trivial zeros. In his famous 1859 memoir, Riemann suggested
that all other (non-trivial) zeros lie on the critical line $\operatorname{Re} s = 1/2$, which
represents the symmetry of the functional equation.


3.3 Zeta values and Bernoulli numbers


One interesting and still unsolved problem is that of determining
polynomial relations over $\mathbb{Q}$ for the numbers $\zeta(s)$, $s = 2, 3, 4, \ldots$.

The first breakthrough in this direction is due to Euler, who showed
that $\zeta(2k)$ is always a rational multiple of $\pi^{2k}$, where

$$\pi = 4\sum_{n=0}^{\infty}\frac{(-1)^n}{2n+1} = 3.14159265358979323846264338327950288419716939937510\ldots.$$

Although we do not follow Euler’s original method, the derivation is worth
reproducing.

For $a \in \mathbb{R}$, the Bernoulli polynomials $B_s(a) \in \mathbb{Q}[a]$, where $s = 0, 1,
2, \ldots$, are defined by the generating function

$$\frac{xe^{ax}}{e^x-1} = \sum_{s=0}^{\infty}B_s(a)\frac{x^s}{s!}, \qquad (3.10)$$

while the Bernoulli numbers $B_s \in \mathbb{Q}$, where $s = 0, 1, 2, \ldots$, are simply
given by $B_s = B_s(0)$. The latter means that the generating function of the
Bernoulli numbers is

$$\frac{x}{e^x-1} = \sum_{s=0}^{\infty}B_s\frac{x^s}{s!}.$$

For example, $B_0 = 1$, $B_1 = -1/2$. The polynomials and numbers satisfy
numerous identities. As an example, we have the formulas $B_s'(a) = sB_{s-1}(a)$
and

$$\sum_{k=M}^{N-1}k^{s-1} = \frac{B_s(N) - B_s(M)}{s}$$

for $s = 1, 2, \ldots$, and also the following ones.
Exercise 3.8. (a) Show that

$$B_s(a) = \sum_{k=0}^{s}\binom{s}{k}B_k a^{s-k} \quad\text{for } s = 0, 1, 2, \ldots.$$

(b) Verify that $B_s = 0$ for odd $s \ge 3$.
(c) Verify that $B_s(1) = B_s = B_s(0)$ for even $s \ge 0$.
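The identities above are easy to experiment with in exact rational arithmetic. The sketch below is only an illustration: it computes the Bernoulli numbers from the recurrence $\sum_{k=0}^{m}\binom{m+1}{k}B_k = 0$ for $m \ge 1$ (obtained by multiplying the generating function by $e^x - 1$ and comparing coefficients), builds $B_s(a)$ from the formula of Exercise 3.8(a), and uses hypothetical function names.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    # Returns [B_0, ..., B_{count-1}] using sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1.
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def bernoulli_poly(s, a):
    # B_s(a) = sum_{k=0}^{s} C(s, k) B_k a^{s-k}, evaluated exactly.
    B = bernoulli_numbers(s + 1)
    a = Fraction(a)
    return sum(comb(s, k) * B[k] * a ** (s - k) for k in range(s + 1))


def power_sum_via_bernoulli(s, M, N):
    # (B_s(N) - B_s(M)) / s, which should equal the sum of k^{s-1} for M <= k < N.
    return (bernoulli_poly(s, N) - bernoulli_poly(s, M)) / s
```

For instance, `power_sum_via_bernoulli(4, 3, 10)` can be compared with the directly computed sum of cubes `sum(k ** 3 for k in range(3, 10))`.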


Lemma 3.1. For $0 < a \le 1$ and $s = -m$ a negative integer,

$$\zeta(-m,a) = -\frac{B_{m+1}(a)}{m+1}.$$


Proof. Recall the integral

$$-\frac{1}{2\pi i}\int_{D(\infty)}\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}\,dz = \frac{\zeta(s,a)}{\Gamma(1-s)}$$

from Proposition 3.3. If $s$ is a negative integer, $s = -m$, the expression

$$\frac{(-z)^{s-1}e^{-az}}{1-e^{-z}}$$

is a single-valued function of $z$, which is analytic in $|z| < 2\pi$, $z \ne 0$. By
Cauchy’s integral theorem, the integral over $D(\infty)$, divided by $2\pi i$, is equal to the residue
of the integrand at $z = 0$, that is, to the coefficient of $z^{-s} = z^{m}$ in

$$\frac{(-1)^{s-1}e^{-az}}{1-e^{-z}} = \frac{(-1)^{m-1}}{z}\cdot\frac{(-z)e^{-az}}{e^{-z}-1} = \frac{(-1)^{m-1}}{z}\sum_{k=0}^{\infty}B_k(a)\frac{(-z)^k}{k!}.$$

It follows that

$$-\frac{\zeta(-m,a)}{m!} = -\frac{\zeta(s,a)}{\Gamma(1-s)}\Big|_{s=-m} = \frac{B_{m+1}(a)}{(m+1)!},$$

which implies the result.


When $a = 1$, we get the following consequence for Riemann’s zeta function
(using also Exercise 3.8).

Proposition 3.6. For $k = 1, 2, \ldots$, we have $\zeta(-2k) = 0$ and $\zeta(1-2k) =
-B_{2k}/(2k)$.


Exercise 3.9. Show that $\zeta(0,a) = \frac12 - a$ and $\zeta(0) = -\frac12$.
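Proposition 3.7. For $k = 1, 2, \ldots$, we have

$$\zeta(2k) = (-1)^{k-1}\frac{(2\pi)^{2k}B_{2k}}{2(2k)!}.$$


Proof. This follows from Proposition 3.6 and the functional equation (3.9)
for $s = 2k$.


In particular,

$$\zeta(2) = \frac{\pi^2}{2\cdot 3}, \qquad \zeta(4) = \frac{\pi^4}{2\cdot 3^2\cdot 5}, \qquad \zeta(6) = \frac{\pi^6}{3^3\cdot 5\cdot 7},$$

$$\zeta(8) = \frac{\pi^8}{2\cdot 3^3\cdot 5^2\cdot 7}, \qquad \zeta(10) = \frac{\pi^{10}}{3^5\cdot 5\cdot 7\cdot 11},$$

$$\zeta(12) = \frac{691\,\pi^{12}}{3^6\cdot 5^3\cdot 7^2\cdot 11\cdot 13}, \qquad \zeta(14) = \frac{2\,\pi^{14}}{3^6\cdot 5^2\cdot 7\cdot 11\cdot 13},$$

and so on.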
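The rational factors in this list can be reproduced mechanically from Proposition 3.7. The sketch below is an illustration with hypothetical function names; it recomputes the Bernoulli numbers from the standard recurrence $\sum_{k=0}^{m}\binom{m+1}{k}B_k = 0$ ($m \ge 1$) and returns the exact rational number $\zeta(2k)/\pi^{2k}$. A second helper evaluates $\zeta(1-2k)$ as dictated by the functional equation (3.9) at $s = 2k$, which can then be compared with Proposition 3.6.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(count):
    # Returns [B_0, ..., B_{count-1}] using sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1.
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def zeta_even_over_pi_power(k):
    # zeta(2k) / pi^(2k) = (-1)^(k-1) * 2^(2k) * B_{2k} / (2 * (2k)!)
    B = bernoulli_numbers(2 * k + 1)
    return (-1) ** (k - 1) * Fraction(2) ** (2 * k) * B[2 * k] / (2 * factorial(2 * k))


def zeta_at_odd_negative(k):
    # zeta(1 - 2k) from (3.9) at s = 2k:
    # pi^(2k) zeta(1 - 2k) = 2^(1 - 2k) * Gamma(2k) * zeta(2k) * cos(pi k).
    return Fraction(1, 2 ** (2 * k - 1)) * factorial(2 * k - 1) * zeta_even_over_pi_power(k) * (-1) ** k
```

For example, `zeta_even_over_pi_power(6)` returns the fraction with numerator 691 from the list above, and `zeta_at_odd_negative(k)` coincides with $-B_{2k}/(2k)$.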


Proposition 3.7 gives us a ‘closed form’ expression for the values of the
zeta function at even integers in terms of $\pi$ and the (rational) Bernoulli
numbers. No similar formulae are known for the values at odd integers. In
Chapter 7 we touch on questions about the arithmetic nature of zeta values (the
values of $\zeta(s)$ at integers $s \ge 2$); see Conjecture 7.1 there.

The difficulty of proving that the ‘odd’ zeta values $\zeta(3), \zeta(5), \zeta(7), \ldots$
are algebraically independent with $\pi$ over $\mathbb{Q}$ serves as a motivation for
introducing a multidimensional generalization of Riemann’s zeta function. For
positive integers $s_1, s_2, \ldots, s_l$ with $s_1 > 1$, consider the values of the
multiple ($l$-tuple) zeta function

$$\zeta(\boldsymbol{s}) = \zeta(s_1, s_2, \ldots, s_l) = \sum_{n_1 > n_2 > \cdots > n_l \ge 1}\frac{1}{n_1^{s_1}n_2^{s_2}\cdots n_l^{s_l}}; \qquad (3.11)$$

the corresponding multi-index $\boldsymbol{s} = (s_1, s_2, \ldots, s_l)$ will be further regarded
as admissible. The quantities (3.11) are called the multiple zeta values
(and abbreviated MZVs), or the multiple harmonic series, or the Euler
sums. The sums (3.11) for $l = 2$ were first investigated by Euler, who
obtained a family of identities connecting double and ordinary zeta values.
In particular, Euler proved the identity

$$\zeta(2,1) = \zeta(3), \qquad (3.12)$$

which was several times rediscovered by others later.
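Euler's identity (3.12) can at least be made plausible numerically. The sketch below is an illustration with hypothetical function names; it uses the rearrangement $\zeta(2,1) = \sum_{n\ge 1} H_{n-1}/n^2$, where $H_{n-1} = 1 + \frac12 + \cdots + \frac{1}{n-1}$, which follows from (3.11) by summing over $n_2 < n_1 = n$ first.

```python
def double_zeta_2_1(cutoff):
    # Partial sum of sum_{n > m >= 1} 1 / (n^2 m) = sum_n H_{n-1} / n^2, n <= cutoff.
    harmonic, total = 0.0, 0.0
    for n in range(1, cutoff + 1):
        total += harmonic / (n * n)  # harmonic holds H_{n-1} at this point
        harmonic += 1.0 / n
    return total


def zeta3(cutoff):
    # Partial sum of zeta(3) = sum 1 / n^3, n <= cutoff.
    return sum(1.0 / n ** 3 for n in range(1, cutoff + 1))
```

The double series converges slowly: the terms dropped after `cutoff` $= N$ add up to about $(\ln N)/N$, so agreement with $\zeta(3)$ to four decimal places requires $N$ of order $10^6$.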


Exercise 3.10. Find your own (elementary) proof of (3.12). 
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The following exercise discusses g-deformations of (multiple) zeta values 
(see also Section 1.3). 


Exercise 3.11. Let ox(n) = an d* the sum of the kth powers of the 


divisors of n. In this exercise we assume that q is a complex parameter 
from the unit disk |g| < 1. 


(a) Show that o;(n) is a multiplicative function (see Section 2.4 and com- 
pare with Exercise 1.1). 
(b) Prove that
\[
\sum_{n=1}^\infty \sigma_k(n) q^n = \sum_{n=1}^\infty \frac{n^k q^n}{1 - q^n}.
\]
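(c) Prove that
\[
\sum_{n=1}^\infty \sigma_1(n) q^n = \sum_{n=1}^\infty \frac{q^n}{(1 - q^n)^2}
\]
and deduce from this that
\[
\lim_{q \to 1} (1 - q)^2 \sum_{n=1}^\infty \sigma_1(n) q^n = \zeta(2).
\]
(d) Prove that
\[
\sum_{n=1}^\infty \sigma_2(n) q^n = \sum_{n=1}^\infty \frac{q^n (1 + q^n)}{(1 - q^n)^3}
\]
and deduce from this that
\[
\lim_{q \to 1} (1 - q)^3 \sum_{n=1}^\infty \sigma_2(n) q^n = 2\zeta(3).
\]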
(e) Demonstrate that 
XO eR ye 


and that the limiting case as $q \to 1$ of this identity after both sides are
multiplied by $(1 - q)^3$ is precisely Euler’s relation (3.12).
(f) Generalise identities from parts (c) and (d) to the form 


[oe) [oe) P, n 
Dela" = d ae 
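
The two limits in parts (c) and (d), read here as $(1-q)^2 \sum \sigma_1(n) q^n \to \zeta(2)$ and $(1-q)^3 \sum \sigma_2(n) q^n \to 2\zeta(3)$, can be observed numerically. The following is a rough sketch under that reading: the series is truncated at $n \le N$ (an assumption that is harmless once $q^N$ is negligible), and the helper names are hypothetical.

```python
from math import pi

def sigma_table(k, N):
    """sigma_k(n) for 0 <= n <= N, computed with a divisor sieve."""
    sig = [0] * (N + 1)
    for d in range(1, N + 1):
        dk = d ** k
        for n in range(d, N + 1, d):    # d divides n
            sig[n] += dk
    return sig

def scaled_divisor_series(k, q, N):
    """(1 - q)^(k+1) times the truncated series sum_{n <= N} sigma_k(n) q^n."""
    sig = sigma_table(k, N)
    total, qn = 0.0, 1.0
    for n in range(1, N + 1):
        qn *= q                          # qn == q**n
        total += sig[n] * qn
    return (1 - q) ** (k + 1) * total
```

With `q = 0.999` and `N = 40000` the two values differ from $\zeta(2)$ and $2\zeta(3)$ by a few thousandths, and the gap shrinks roughly linearly in $1 - q$.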
3.4 Analytic continuation of multiple zeta function


In this part, we discuss analytic properties of the multiple zeta function
(MZF)
\[
\zeta(\boldsymbol{s}) = \zeta(s_1, \dots, s_l)
= \sum_{n_1 > n_2 > \dots > n_l \ge 1} \frac{1}{n_1^{s_1} n_2^{s_2} \cdots n_l^{s_l}} \tag{3.13}
\]
as a function of complex variables $s_1, \dots, s_l$; the notation $\sigma_1, \dots, \sigma_l$ will be
used for the real parts of $s_1, \dots, s_l$.


Exercise 3.12. Show that the multiple series in (3.13) converges absolutely
in the domain
\[
\sigma_1 + \dots + \sigma_j = \operatorname{Re}(s_1 + \dots + s_j) > j \quad \text{for every } j = 1, \dots, l.
\]
Conclude from this that the MZV is analytic in each of its variables in the
domain $\sigma_1 + \dots + \sigma_j > j$, where $j = 1, \dots, l$.

Hint. Use mathematical induction on $l$ and the estimates
\[
\sum_{n > M} \frac{1}{n^\sigma} \le \frac{1}{(\sigma - 1) M^{\sigma - 1}},
\]
where $M \ge 1$ is integral and $\sigma > 1$ is real, coming from the integral test
(when the partial sums of a series are compared to Riemann sums).


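Lemma 3.2. For $0 < a < 1$ and an integer $m \ge 2$,
\[
\sum_{n \in \mathbb{Z}}{}^{\!\prime}\; \frac{e^{2\pi i n a}}{(2\pi i n)^m} = -\frac{B_m(a)}{m!},
\]
where the dash in summation corresponds to omitting the (problematic)
index $n = 0$.

Proof. Comparing Hurwitz’s equation (3.8),
\[
\frac{\zeta(s, a)}{\Gamma(1 - s)} = 2 \sum_{n=1}^\infty \frac{\sin(\pi s/2 + 2\pi a n)}{(2\pi n)^{1-s}},
\]
for $s = -m + 1$, with the result of Lemma 3.1,
\[
\frac{B_m(a)}{m!} = -\frac{\zeta(s, a)}{\Gamma(1 - s)} \bigg|_{s = -m+1},
\]
we find
\[
\frac{B_m(a)}{m!} = -2 \sum_{n=1}^\infty \frac{\sin(-\pi(m - 1)/2 + 2\pi a n)}{(2\pi n)^m},
\]
which is exactly
\[
(-1)^{k+1} \sum_{n=1}^\infty \frac{2 \sin 2\pi a n}{(2\pi n)^{2k+1}}
= -\sum_{n=1}^\infty \frac{e^{2\pi i n a} - e^{-2\pi i n a}}{(2\pi i n)^{2k+1}}
= -\sum_{n=1}^\infty \frac{e^{2\pi i n a}}{(2\pi i n)^{2k+1}} - \sum_{n=1}^\infty \frac{e^{-2\pi i n a}}{(-2\pi i n)^{2k+1}}
\]
or
\[
(-1)^{k+1} \sum_{n=1}^\infty \frac{2 \cos 2\pi a n}{(2\pi n)^{2k}}
= -\sum_{n=1}^\infty \frac{e^{2\pi i n a} + e^{-2\pi i n a}}{(2\pi i n)^{2k}}
= -\sum_{n=1}^\infty \frac{e^{2\pi i n a}}{(2\pi i n)^{2k}} - \sum_{n=1}^\infty \frac{e^{-2\pi i n a}}{(-2\pi i n)^{2k}}
\]
depending on whether $m = 2k + 1$ is odd or $m = 2k$ is even.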


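Lemma 3.3. For $0 < a < 1$ and any integer $m \ge 2$,
\[
|B_m(a)| < \frac{4\, m!}{(2\pi)^m}.
\]

Proof. It follows from Lemma 3.2 that
\[
|B_m(a)| \le m! \sum_{n \in \mathbb{Z}}{}^{\!\prime}\; \frac{1}{(2\pi |n|)^m} = \frac{2\, m!\, \zeta(m)}{(2\pi)^m}.
\]
It remains to apply the trivial estimate $\zeta(m) \le \zeta(2) = \pi^2/6 < 2$.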


For the statement and application of the following classical result, it
will be convenient to introduce the periodic Bernoulli polynomials given by
$\widetilde{B}_m(a) = B_m(\{a\})$, where $\{\,\cdot\,\}$ denotes the fractional part of a real number.
By Lemma 3.3 (and Exercise 3.8) we get the estimate
\[
|\widetilde{B}_m(a)| < \frac{4\, m!}{(2\pi)^m} \quad \text{for } m = 2, 3, \dots, \tag{3.14}
\]
now valid for all real $a$.

Exercise 3.13. Verify the validity of (3.14) for $m = 0, 1$.

We will also implement the (standard) notation
\[
(s)_m = \frac{\Gamma(s + m)}{\Gamma(s)} =
\begin{cases}
s(s+1)\cdots(s+m-1) & \text{if } m = 1, 2, \dots, \\
1 & \text{if } m = 0,
\end{cases} \tag{3.15}
\]
for the Pochhammer symbol, though it makes sense for any (not necessarily
integer or non-negative) $m$. For example, $(s)_{-1} = \Gamma(s-1)/\Gamma(s) = 1/(s-1)$.


48 Analytic methods in number theory 


Proposition 3.8 (Euler-Maclaurin summation). Let $f(x)$ be a (complex-valued)
$C^\infty$ function on the real interval $[1, \infty)$. Then for any positive
integers $N$ and $m$, $m$ even,
\[
\sum_{n=1}^N f(n) = \int_1^N f(x)\,dx + \frac{1}{2}\bigl(f(1) + f(N)\bigr)
+ \sum_{k=2}^m \frac{B_k}{k!} \bigl(f^{(k-1)}(N) - f^{(k-1)}(1)\bigr)
- \frac{1}{m!} \int_1^N \widetilde{B}_m(x) f^{(m)}(x)\,dx.
\]
Notice that the sum over $k$ in the formula only involves $k$ even, because
$B_k = 0$ for odd $k > 2$.


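Lemma 3.4. Given $s \in \mathbb{C}$ with $\operatorname{Re} s > 1$, for integers $M \ge 1$ and $m \ge 2$,
$m$ even, we have
\[
\sum_{n > M} \frac{1}{n^s} = \sum_{k=0}^m \frac{(s)_{k-1} B_k}{k!\, M^{s+k-1}}
- \frac{(s)_m}{m!} \int_M^\infty \frac{\widetilde{B}_m(x)}{x^{s+m}}\,dx.
\]

Proof. Apply Proposition 3.8 with $f(x) = 1/x^s$ twice: when $N \to \infty$ and
when $N = M$. Taking the difference of the results we arrive at
\[
\sum_{n > M} \frac{1}{n^s}
= \int_M^\infty f(x)\,dx - \frac{1}{2} f(M) - \sum_{k=2}^m \frac{B_k}{k!} f^{(k-1)}(M)
- \frac{1}{m!} \int_M^\infty \widetilde{B}_m(x) f^{(m)}(x)\,dx
\]
\[
= \frac{1}{(s-1) M^{s-1}} - \frac{1}{2 M^s} + \sum_{k=2}^m \frac{B_k\, (s)_{k-1}}{k!\, M^{s+k-1}}
- \frac{(s)_m}{m!} \int_M^\infty \frac{\widetilde{B}_m(x)}{x^{s+m}}\,dx,
\]
which can be written in the desired form because $B_0 = 1$ and $B_1 = -1/2$.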
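
The lemma also gives a practical way to evaluate $\zeta(s)$: add the first $M$ terms of the series to the finite sum over $k$ and drop the integral remainder. The sketch below is a minimal illustration under that reading of Lemma 3.4; the defaults $M = 10$, $m = 10$ and all function names are assumptions made for the example.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        acc = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-acc / (m + 1))
    return B

def pochhammer_shifted(s, k):
    """(s)_{k-1}: equals 1/(s-1) for k = 0 and s(s+1)...(s+k-2) for k >= 1."""
    if k == 0:
        return 1.0 / (s - 1)
    prod = 1.0
    for j in range(k - 1):
        prod *= s + j
    return prod

def zeta_euler_maclaurin(s, M=10, m=10):
    """Approximate zeta(s) for s != 1: sum over n <= M plus the k-sum of Lemma 3.4.

    The integral remainder is dropped, so this is an approximation only."""
    B = bernoulli_numbers(m)
    head = sum(n ** (-s) for n in range(1, M + 1))
    tail = sum(pochhammer_shifted(s, k) * float(B[k]) / (factorial(k) * M ** (s + k - 1))
               for k in range(m + 1))
    return head + tail
```

For `s = 2` the result agrees with $\pi^2/6$ to about eleven digits. The same expression still makes sense for `s = -1`, where the factor $(s)_m$ in front of the remainder vanishes and the value $-1/12$ comes out up to rounding, in the spirit of Exercise 3.14.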


Exercise 3.14. Use Lemma 3.4 (with $M = 1$, say) and the estimates of
Lemma 3.3 to show that Riemann’s zeta function can be analytically
continued to the half-plane $\operatorname{Re} s > -L$ for any real $L > 0$.


Introduce the following discrete subset of $\mathbb{C}^l$:
\[
S_l = \bigl\{ \boldsymbol{s} \in \mathbb{C}^l : s_1 \in \{1\},\ s_1 + s_2 \in \{1, 2\} \cup 2\mathbb{Z}_{\le 0},\
s_1 + \dots + s_j \in \mathbb{Z}_{\le j} \text{ for } j = 3, \dots, l \bigr\}.
\]
The following general result provides the analytic continuation of the MZV
$\zeta(\boldsymbol{s})$ to a meromorphic function on $\mathbb{C}^l$ with (at most) simple poles given
by $S_l$.


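Theorem 3.2. Assume $l \ge 2$. Then for any $\boldsymbol{s} = (s_1, \dots, s_l) \in \mathbb{C}^l \setminus S_l$ and
an even $m > l + |\sigma_1| + \dots + |\sigma_l|$, we have
\[
\zeta(\boldsymbol{s}) = \sum_{k=0}^m \frac{B_k}{k!} (s_1)_{k-1}\, \zeta(s_1 + s_2 + k - 1, s_3, \dots, s_l)
- \frac{(s_1)_m}{m!} \sum_{n_2 > \dots > n_l \ge 1} \frac{1}{n_2^{s_2} \cdots n_l^{s_l}}
\int_{n_2}^\infty \frac{\widetilde{B}_m(x)}{x^{s_1+m}}\,dx. \tag{3.16}
\]

Proof. The absolute convergence of the second series in the formula (3.16)
follows from the estimate
\[
\biggl| \int_M^\infty \frac{\widetilde{B}_m(x)}{x^{s+m}}\,dx \biggr|
\le \frac{4\, m!}{(2\pi)^m} \int_M^\infty \frac{dx}{x^{\sigma+m}}
= \frac{4\, m!}{(2\pi)^m (m - 1 + \sigma) M^{m-1+\sigma}},
\]
where $\sigma = \operatorname{Re} s$, implying
\[
\biggl| \sum_{n_2 > \dots > n_l \ge 1} \frac{1}{n_2^{s_2} n_3^{s_3} \cdots n_l^{s_l}}
\int_{n_2}^\infty \frac{\widetilde{B}_m(x)}{x^{s_1+m}}\,dx \biggr|
\le \frac{4\, m!}{(2\pi)^m (m - 1 + \sigma_1)}
\sum_{n_2 > \dots > n_l \ge 1} \frac{1}{n_2^{m-1+\sigma_1+\sigma_2} n_3^{\sigma_3} \cdots n_l^{\sigma_l}}.
\]
For the latter sum we use
\[
\frac{1}{n_3^{\sigma_3} \cdots n_l^{\sigma_l}} \le n_3^{|\sigma_3|} \cdots n_l^{|\sigma_l|} \le n_2^{|\sigma_3| + \dots + |\sigma_l|}
\]
and the fact that the number of integers $n_3, \dots, n_l$ satisfying $n_2 > n_3 > \dots > n_l \ge 1$
is bounded above by $n_2^{l-2}$ (because each $n_j$ satisfies $1 \le n_j < n_2$), so that
\[
\sum_{n_2 > \dots > n_l \ge 1} \frac{1}{n_2^{m-1+\sigma_1+\sigma_2} n_3^{\sigma_3} \cdots n_l^{\sigma_l}}
\le \sum_{n_2 \ge 1} \frac{n_2^{|\sigma_1| + |\sigma_2| + |\sigma_3| + \dots + |\sigma_l| + l - 2}}{n_2^{m-1}}
\]
converges when $m > l + |\sigma_1| + \dots + |\sigma_l|$.
Now, to get the formula (3.16) we apply Lemma 3.4 with $s = s_1$, $n = n_1$
and $M = n_2$, and then perform the summation over $n_2 > n_3 > \dots > n_l \ge 1$.
It remains to carefully control the (potential) poles by induction on $l$.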


Exercise 3.15. Show that the potential poles of $\zeta(\boldsymbol{s})$ at $\boldsymbol{s} \in S_l$ are at most
simple.

Hint. Notice that the second (multiple) sum in (3.16) is analytic, so that
the only source for poles comes from
\[
\sum_{k=0}^m \frac{B_k}{k!} (s_1)_{k-1}\, \zeta(s_1 + s_2 + k - 1, s_3, \dots, s_l).
\]
Use mathematical induction on $l$ and the fact that $\zeta(s)$ (when $l = 1$) has
one simple pole at $s = 1$.


Chapter notes 


Though the Whittaker-Watson textbook [80] remains our principal
recommendation for treatment of basic special functions, the book [77] is a
minimalistic alternative, especially if planned as a real-life university course.
When it comes to the Bernoulli numbers and polynomials and their
connection with zeta functions of various types, there is way more to say; the
book [5] is a tremendous source on this topic.

The multiple zeta values and their generalisations have received very
special attention and have been under extensive study in recent decades,
in connection with problems of not only number theory but also of
combinatorics, algebra, analysis, algebraic geometry, quantum physics, and many
other branches of mathematics. This is a topic of its own, with many
results and challenges left; the interested reader is advised to consult
the books [20,87].


Chapter 4 


Continued fractions 


4.1 Euclidean algorithm and continuants 


Let $a, b \in \mathbb{Z}$ with $a > b > 0$. Defining $r_1 = b$, consider the following
successive application of division with remainder:
\[
\begin{aligned}
a &= a_1 r_1 + r_2, & 0 &< r_2 < r_1, \\
r_1 &= a_2 r_2 + r_3, & 0 &< r_3 < r_2, \\
&\;\;\vdots \\
r_{n-2} &= a_{n-1} r_{n-1} + r_n, & 0 &< r_n < r_{n-1}, \\
r_{n-1} &= a_n r_n + 0.
\end{aligned} \tag{4.1}
\]
Then the last non-zero remainder $r_n$ is the greatest common divisor of $a$
and $b$.

Critically, the procedure (4.1) terminates at some step in view of the
following chain of inequalities:
\[
b = r_1 > r_2 > \dots > r_{n-1} > r_n > 0.
\]
Also observe that on the last step we get $a_n \ge 2$, as otherwise ($a_n = 1$) we
would have $r_{n-1} = r_n$ contradicting $r_n < r_{n-1}$.

By applying consecutively the steps of the Euclidean algorithm we
deduce the representation
\[
\frac{a}{b} = a_1 + \frac{r_2}{r_1} = a_1 + \frac{1}{r_1/r_2} = \dots
= a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}, \tag{4.2}
\]
where $a_1, \dots, a_n \in \mathbb{Z}_{>0}$ (with $a_n \ge 2$) are the partial quotients of the
finite continued fraction for $a/b$. Notice that the intermediate divisors
$r_2, \dots, r_n$ do not appear in the continued fraction representation, so that
we can restrict ourselves to the case of $a$ and $b$ relatively prime, $(a, b) = 1$.
Furthermore, the representation (4.2) with $a_n \ge 2$ is unique for $a/b > 1$
rational, because it is in a one-to-one correspondence with the steps in the
Euclidean algorithm.
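
The passage from (4.1) to (4.2) is easy to mechanise. The following sketch (function names are illustrative) collects the partial quotients produced by the Euclidean algorithm and evaluates the finite continued fraction back to a rational number, so the two directions can be compared.

```python
from fractions import Fraction

def partial_quotients(a, b):
    """Partial quotients [a_1, ..., a_n] of a/b produced by the Euclidean algorithm (4.1)."""
    quotients = []
    while b != 0:
        quotient, remainder = divmod(a, b)   # a = quotient * b + remainder
        quotients.append(quotient)
        a, b = b, remainder
    return quotients

def evaluate(quotients):
    """Evaluate a_1 + 1/(a_2 + 1/(... + 1/a_n)) exactly, from the innermost term outwards."""
    value = Fraction(quotients[-1])
    for a_k in reversed(quotients[:-1]):
        value = a_k + 1 / value
    return value
```

For example, `partial_quotients(43, 19)` gives `[2, 3, 1, 4]`, and the pair `(86, 38)` gives the same list, in line with the remark that a common factor of $a$ and $b$ does not affect the expansion.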

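Consider now $a_1, a_2, a_3, \dots$ as unknowns (variables) and define
polynomials $p_n = p_n(a_1, \dots, a_n)$ and $q_n = q_n(a_1, \dots, a_n)$ as follows: $p_1(a_1) = a_1$,
$q_1(a_1) = 1$ and
\[
\begin{aligned}
p_n(a_1, a_2, \dots, a_n) &= a_1 p_{n-1}(a_2, \dots, a_n) + q_{n-1}(a_2, \dots, a_n), \\
q_n(a_1, a_2, \dots, a_n) &= p_{n-1}(a_2, \dots, a_n),
\end{aligned} \tag{4.3}
\]
for $n = 2, 3, \dots$. One can also start (4.3) from $n = 1$ by setting $p_0() = 1$
and $q_0() = 0$.

Lemma 4.1. For $n = 1, 2, \dots$, we have
\[
\frac{p_n(a_1, a_2, \dots, a_n)}{q_n(a_1, a_2, \dots, a_n)}
= a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}.
\]

Proof. This follows by induction on $n$ from noticing that $p_1(a_1)/q_1(a_1) = a_1$
and
\[
\frac{p_n(a_1, a_2, \dots, a_n)}{q_n(a_1, a_2, \dots, a_n)}
= \frac{a_1 p_{n-1}(a_2, \dots, a_n) + q_{n-1}(a_2, \dots, a_n)}{p_{n-1}(a_2, \dots, a_n)}
= a_1 + \frac{1}{p_{n-1}(a_2, \dots, a_n)/q_{n-1}(a_2, \dots, a_n)}.
\]

Lemma 4.2. We have the matrix identity
\[
\begin{pmatrix} p_n(a_1, a_2, \dots, a_n) \\ q_n(a_1, a_2, \dots, a_n) \end{pmatrix}
= \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} p_{n-1}(a_2, \dots, a_n) \\ q_{n-1}(a_2, \dots, a_n) \end{pmatrix}.
\]

Proof. This is just another way to write (4.3).

Iterating the identity of Lemma 4.2 we obtain
\[
\begin{pmatrix} p_n(a_1, a_2, \dots, a_n) \\ q_n(a_1, a_2, \dots, a_n) \end{pmatrix}
= \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_2 & 1 \\ 1 & 0 \end{pmatrix} \cdots
\begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_n \\ 1 \end{pmatrix}
= \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_2 & 1 \\ 1 & 0 \end{pmatrix} \cdots
\begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]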


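Using this we arrive at

Lemma 4.3 (key identity). For $p_n = p_n(a_1, \dots, a_n)$, $q_n = q_n(a_1, \dots, a_n)$,
$p_{n-1} = p_{n-1}(a_1, \dots, a_{n-1})$ and $q_{n-1} = q_{n-1}(a_1, \dots, a_{n-1})$, we have the
matrix identity
\[
\begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix}
= \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_2 & 1 \\ 1 & 0 \end{pmatrix} \cdots
\begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix}.
\]
In particular, this implies
\[
\begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix}
= \begin{pmatrix} p_{n-1} & p_{n-2} \\ q_{n-1} & q_{n-2} \end{pmatrix} \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix};
\]
hence for the first column on the left-hand side
\[
p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2},
\]
for $n = 2, 3, 4, \dots$, where all the polynomials involved depend on $a_1, a_2, \dots$
(without the shift!).

Finally, we call the polynomial
\[
C(a_1, a_2, \dots, a_n) = p_n(a_1, a_2, \dots, a_n)
\]
the continuant on variables $a_1, \dots, a_n$. (We also set $C() = 1$ for the empty
set of variables.) It follows from (4.3) that
\[
q_n(a_1, a_2, \dots, a_n) = C(a_2, \dots, a_n),
\]
so that Lemma 4.1 reads
\[
\frac{C(a_1, a_2, \dots, a_n)}{C(a_2, \dots, a_n)}
= a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}.
\]
Furthermore, the key identity assumes the form
\[
\begin{pmatrix} C(a_1, a_2, \dots, a_{n-1}, a_n) & C(a_1, a_2, \dots, a_{n-1}) \\
C(a_2, \dots, a_{n-1}, a_n) & C(a_2, \dots, a_{n-1}) \end{pmatrix}
= \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_2 & 1 \\ 1 & 0 \end{pmatrix} \cdots
\begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix}, \tag{4.4}
\]
and we have the following properties of the continuant.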


Lemma 4.4. For $n = 2, 3, \dots$ and $a_i \in \mathbb{Z}_{>0}$, we have

(a) $C(a_1, a_2, \dots, a_n) = C(a_n, \dots, a_2, a_1)$;
(b) $C(a_1, \dots, a_s, a_{s+1}, \dots, a_n) = C(a_1, \dots, a_s)\, C(a_{s+1}, \dots, a_n)
+ C(a_1, \dots, a_{s-1})\, C(a_{s+2}, \dots, a_n)$.

We can refer to property (a) as reflection, and to (b) as reduction.

Proof. For (a), use the matrix transposition of the key identity (4.4). For
(b), write the identity in the form
\[
\begin{pmatrix} C(a_1, \dots, a_n) & C(a_1, \dots, a_{n-1}) \\ C(a_2, \dots, a_n) & C(a_2, \dots, a_{n-1}) \end{pmatrix}
= \begin{pmatrix} C(a_1, \dots, a_s) & C(a_1, \dots, a_{s-1}) \\ C(a_2, \dots, a_s) & C(a_2, \dots, a_{s-1}) \end{pmatrix}
\begin{pmatrix} C(a_{s+1}, \dots, a_n) & C(a_{s+1}, \dots, a_{n-1}) \\ C(a_{s+2}, \dots, a_n) & C(a_{s+2}, \dots, a_{n-1}) \end{pmatrix}
\]
and read off the (1, 1)-coefficient of the matrix on the left- and right-hand
sides of the result.
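
Continuants are convenient to compute with the three-term recurrence $p_n = a_n p_{n-1} + p_{n-2}$ from Lemma 4.3. The sketch below uses that recurrence; the starting pair $(0, 1)$ for the two previous values is a convention internal to this example (it reproduces $C() = 1$ and $C(a_1) = a_1$).

```python
def continuant(a):
    """C(a_1, ..., a_n) via C_k = a_k * C_{k-1} + C_{k-2}, with C() = 1."""
    prev, curr = 0, 1                 # internal convention: (C_{-1}, C_0) = (0, 1)
    for a_k in a:
        prev, curr = curr, a_k * curr + prev
    return curr
```

On the sample list `[2, 3, 1, 4]` one finds `continuant([2, 3, 1, 4]) == 43`, the reflection `continuant([4, 1, 3, 2]) == 43`, and the reduction with $s = 2$ reads $43 = C(2,3)\,C(1,4) + C(2)\,C(4) = 7 \cdot 5 + 2 \cdot 4$.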


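Exercise 4.1 (H.J.S. Smith). Show the following determinant expression
for the continuant:
\[
C(a_1, a_2, a_3, \dots, a_n) = \det
\begin{pmatrix}
a_1 & 1 & 0 & \cdots & 0 & 0 \\
-1 & a_2 & 1 & \cdots & 0 & 0 \\
0 & -1 & a_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{n-1} & 1 \\
0 & 0 & 0 & \cdots & -1 & a_n
\end{pmatrix}.
\]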


4.2 Primes as sums of two squares 


In this section we witness an application of continuants to the following
classic.

Theorem 4.1 (Fermat, Gauss). Any prime $p \equiv 1 \pmod 4$ can be
represented in the form $u^2 + v^2$, where $u, v \in \mathbb{Z}_{>0}$, and this representation is
unique.

Proof. Existence. Write our prime $p = 4r + 1$ and, for each number
$\mu \in \{2, 3, \dots, 2r\}$, run the Euclidean algorithm (4.1) for the fraction $p/\mu$:
\[
\frac{p}{\mu} = \frac{C(a_1, a_2, \dots, a_n)}{C(a_2, \dots, a_n)}.
\]
Note that $2 < p/\mu < p/2$, so that $a_1 \ge 2$, and from $\gcd(p, \mu) = 1$ we get
\[
p = C(a_1, a_2, \dots, a_n) \quad \text{and} \quad \mu = C(a_2, \dots, a_n).
\]
Observe that $p = C(a_n, a_{n-1}, \dots, a_1)$ with $a_n \ge 2$ and take
$\nu = C(a_{n-1}, \dots, a_1)$. It follows then that
\[
\frac{p}{\nu} = \frac{C(a_n, a_{n-1}, \dots, a_1)}{C(a_{n-1}, \dots, a_1)},
\]
hence $\nu \in \{2, 3, \dots, 2r\}$ as well. This defines an involution $\mu \mapsto \nu$ on the
set $A = \{2, 3, \dots, 2r\}$, because applying the construction to $\nu$ we will get $\mu$
from it. The set $A$ contains an odd number of elements, so that for at least
one $\mu$ from the set we should have $\nu = \mu$. For such $\mu$ we thus obtain
\[
\frac{C(a_1, a_2, \dots, a_n)}{C(a_2, \dots, a_n)} = \frac{p}{\mu} = \frac{C(a_n, a_{n-1}, \dots, a_1)}{C(a_{n-1}, \dots, a_1)}.
\]
Because of the uniqueness of representation of $p/\mu$ by a continued fraction
(in fact, the uniqueness of the Euclidean algorithm for the pair $p > \mu > 0$),
the latter is only possible if $a_2 = a_{n-1}$, $a_3 = a_{n-2}$, \dots, $a_n = a_1$; in other
words, when $p = C(a_1, a_2, \dots, a_2, a_1)$ has a representation as a palindromic
continuant. Consider now two possibilities: $n$ is odd, and $n$ is even.

If $n = 2s - 1$ for some $s \ge 2$, then $p = C(a_1, \dots, a_{s-1}, a_s, a_{s-1}, \dots, a_1)$,
and Lemma 4.4 implies
\[
\begin{aligned}
p &= C(a_1, \dots, a_{s-1}, a_s)\, C(a_{s-1}, \dots, a_1) + C(a_1, \dots, a_{s-1})\, C(a_{s-2}, \dots, a_1) \\
&= \bigl(C(a_1, \dots, a_{s-1}, a_s) + C(a_1, \dots, a_{s-2})\bigr)\, C(a_1, \dots, a_{s-1}),
\end{aligned}
\]
meaning that the prime $p$ is divisible by the integer $C(a_1, \dots, a_{s-1}) \ge a_1 \ge 2$,
a contradiction.

If $n = 2s$ for some $s \ge 1$, then $p = C(a_1, \dots, a_s, a_s, \dots, a_1)$, and
Lemma 4.4 implies
\[
\begin{aligned}
p &= C(a_1, \dots, a_s)\, C(a_s, \dots, a_1) + C(a_1, \dots, a_{s-1})\, C(a_{s-1}, \dots, a_1) \\
&= C(a_1, \dots, a_s)^2 + C(a_1, \dots, a_{s-1})^2,
\end{aligned}
\]
the required representation $p = u^2 + v^2$.
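
The existence proof is constructive and can be turned into a short search. The sketch below (helper names are illustrative) scans $\mu \in \{2, \dots, 2r\}$ for a palindromic list of partial quotients of even length and reads off $u$ and $v$ as the two continuants of the first half. It is a brute-force illustration of the argument, not an efficient algorithm.

```python
def partial_quotients(a, b):
    """Partial quotients of a/b from the Euclidean algorithm."""
    quotients = []
    while b != 0:
        quotient, remainder = divmod(a, b)
        quotients.append(quotient)
        a, b = b, remainder
    return quotients

def continuant(a):
    """C(a_1, ..., a_n) via the recurrence C_k = a_k * C_{k-1} + C_{k-2}, C() = 1."""
    prev, curr = 0, 1
    for a_k in a:
        prev, curr = curr, a_k * curr + prev
    return curr

def two_squares(p):
    """For a prime p = 4r + 1, return (u, v) with u*u + v*v == p."""
    r = (p - 1) // 4
    for mu in range(2, 2 * r + 1):
        a = partial_quotients(p, mu)
        if len(a) % 2 == 0 and a == a[::-1]:      # palindromic, n = 2s
            s = len(a) // 2
            return continuant(a[:s]), continuant(a[:s - 1])
    raise ValueError("expected a prime p congruent to 1 modulo 4")
```

For $p = 13$ the fixed point of the involution is $\mu = 5$, with $13/5 = [2, 1, 1, 2]$, which yields $13 = C(2,1)^2 + C(2)^2 = 3^2 + 2^2$.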


Exercise 4.2. Let $p = 4r + 1$ be a prime. Then there are exactly $2r$
distinct representations of $p$ as continuants $C(a_1, \dots, a_n)$ with the first and last
entries $a_1, a_n \ge 2$.

Hint. Use distinct representations coming from
\[
\frac{p}{\mu} = \frac{C(a_1, a_2, \dots, a_n)}{C(a_2, \dots, a_n)}
\]
when $\mu \in \{1, 2, \dots, 2r\}$.

Lemma 4.5. For $n = 2, 3, \dots$,
\[
C(a_1, a_2, \dots, a_{n-1}, a_n)\, C(a_2, \dots, a_{n-1})
- C(a_1, a_2, \dots, a_{n-1})\, C(a_2, \dots, a_{n-1}, a_n) = (-1)^n.
\]

Proof. Simply compute the determinants on both sides of (4.4).

Theorem 4.2 (Euler). Given a prime $p \equiv 1 \pmod 4$, there is a solution
$x_0$ to the equation $x^2 \equiv -1 \pmod p$ with $1 < x_0 < p/2$. In addition, the
two different residues $\pm x_0 \pmod p$ exhaust all solutions to the equation.

Proof. We use the palindromic representation $p = C(a_1, \dots, a_s, a_s, \dots, a_1)$
with $a_1 \ge 2$ found in the proof of Theorem 4.1. Define
$x_0 = C(a_2, \dots, a_s, a_s, \dots, a_2, a_1)$, so that
\[
\frac{p}{x_0} = \frac{C(a_1, \dots, a_s, a_s, \dots, a_2, a_1)}{C(a_2, \dots, a_s, a_s, \dots, a_2, a_1)}
\]
and $1 < x_0 < p/2$. It follows then from Lemma 4.5 applied with $n = 2s$
and Lemma 4.4 (a) that
\[
\begin{aligned}
1 = (-1)^n &= C(a_1, \dots, a_s, a_s, \dots, a_1)\, C(a_2, \dots, a_s, a_s, \dots, a_2) \\
&\quad - C(a_1, \dots, a_s, a_s, \dots, a_2)\, C(a_2, \dots, a_s, a_s, \dots, a_1) \\
&= p \cdot C(a_2, \dots, a_s, a_s, \dots, a_2) - x_0^2.
\end{aligned}
\]
Restricting this equality modulo $p$ leads to $x_0^2 \equiv -1 \pmod p$. Clearly, both
$x_0 \pmod p$ and $-x_0 \pmod p$ are solutions to $x^2 \equiv -1 \pmod p$, and the
quadratic equation does not possess more than two solutions in the field
$\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$.


Proof of Theorem 4.1. Uniqueness. Assume that there are two
representations $p = u^2 + v^2 = u'^2 + v'^2$ of the given prime $p \equiv 1 \pmod 4$, with $u < v$
and $u' < v'$. Run the Euclidean algorithm on the rational numbers $v/u > 1$
and $v'/u' > 1$ to get the corresponding representations
\[
\frac{v}{u} = \frac{C(a_1, a_2, \dots, a_s)}{C(a_2, \dots, a_s)} \quad \text{and} \quad
\frac{v'}{u'} = \frac{C(a'_1, a'_2, \dots, a'_t)}{C(a'_2, \dots, a'_t)},
\]
with $a_s \ge 2$ and $a'_t \ge 2$. From Lemma 4.4 we deduce that
\[
\begin{aligned}
p = v^2 + u^2
&= C(a_s, \dots, a_2, a_1)\, C(a_1, a_2, \dots, a_s) + C(a_s, \dots, a_2)\, C(a_2, \dots, a_s) \\
&= C(a_s, \dots, a_2, a_1, a_1, a_2, \dots, a_s)
\end{aligned}
\]
and, similarly,
\[
p = v'^2 + u'^2 = C(a'_t, \dots, a'_2, a'_1, a'_1, a'_2, \dots, a'_t).
\]
By the proof of Theorem 4.2 we know that both $C(a_{s-1}, \dots, a_1, a_1, \dots, a_{s-1}, a_s)$
and $C(a'_{t-1}, \dots, a'_1, a'_1, \dots, a'_{t-1}, a'_t)$ are solutions to
$x^2 \equiv -1 \pmod p$ in the range $1 < x < p/2$, so that they must coincide:
\[
\mu = C(a_{s-1}, \dots, a_1, a_1, \dots, a_{s-1}, a_s) = C(a'_{t-1}, \dots, a'_1, a'_1, \dots, a'_{t-1}, a'_t).
\]
It follows then from the uniqueness of the Euclidean algorithm for the pair
$p > \mu > 0$ (equivalently, of the continued fraction for $p/\mu$) that in the
equality
\[
\frac{p}{\mu} = \frac{C(a_s, a_{s-1}, \dots, a_1, a_1, \dots, a_{s-1}, a_s)}{C(a_{s-1}, \dots, a_1, a_1, \dots, a_{s-1}, a_s)}
= \frac{C(a'_t, a'_{t-1}, \dots, a'_1, a'_1, \dots, a'_{t-1}, a'_t)}{C(a'_{t-1}, \dots, a'_1, a'_1, \dots, a'_{t-1}, a'_t)}
\]
we have $t = s$ and $a'_i = a_i$ for all $i$.


4.3 Continued fraction of a real number


It is now a good moment to introduce a compact notation for the finite
continued fraction in (4.2):
\[
[a_1, a_2, \dots, a_n] = a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}
= \frac{C(a_1, a_2, \dots, a_n)}{C(a_2, \dots, a_n)}.
\]
This is clearly a rational function of the variables $a_1, a_2, \dots, a_n$ involved. At the
same time, the expression originates from applying the Euclidean algorithm
to a rational number $\alpha = a/b$ as in Section 4.1. The procedure can be
alternatively interpreted as follows: take $a_1 = \lfloor \alpha \rfloor$ and, if $\alpha$ is not an integer,
then write it in the form $\alpha = a_1 + 1/\alpha_2$, where $\alpha_2 > 1$ is again a rational
number. Inductively, we choose $a_n = \lfloor \alpha_n \rfloor$ and $\alpha_n = a_n + 1/\alpha_{n+1}$ with
$\alpha_{n+1} > 1$ if $\alpha_n$ is not an integer. The procedure terminates at some step
(that is, eventually we get an integer $\alpha_n = a_n > 1$), so that $\alpha = [a_1, \dots, a_n]$.
If we discard the condition $a_n > 1$ for $n > 1$ then the number $\alpha$ can be
also represented as $\alpha = [a_1, \dots, a_n - 1, 1]$. This fact is sometimes useful for
manipulating the parity of a particular length of a finite continued fraction.

The recursive algorithm above extends to the case of an irrational
number $\alpha$ with no trouble; however, at each step we obtain irrational $\alpha_n > 1$,
so that the continued fraction cannot be finite. We will use the notation
\[
\alpha = [a_1, a_2, \dots, a_n, \dots]
\]
for this infinite case. Observe that the procedure implies that $\alpha = \alpha_1$ and
\[
\alpha = [a_1, a_2, \dots, a_n, \alpha_{n+1}]
\]
for each $n \ge 0$, as well as $a_n < \alpha_n < a_n + 1$ for $n \ge 1$ and $a_n \ge 1$ for
$n \ge 2$. Furthermore, by truncating the infinite continued fraction at the
$n$th step, we get the pair of relatively prime integers $p_n = C(a_1, a_2, \dots, a_n)$,
$q_n = C(a_2, \dots, a_n)$ such that $[a_1, a_2, \dots, a_n] = p_n/q_n$. Here
$1 = q_1 \le q_2 < \dots < q_n < \cdots$ because of the recursive formulae for $q_n$. The fraction
$p_n/q_n = [a_1, a_2, \dots, a_n]$ is called the $n$th (principal) convergent of $\alpha$, while
$a_n$ is the $n$th partial quotient.
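
Convergents are computed most easily from the recurrences $p_n = a_n p_{n-1} + p_{n-2}$, $q_n = a_n q_{n-1} + q_{n-2}$ of Lemma 4.3, started from $(p_0, q_0) = (1, 0)$ and $(p_1, q_1) = (a_1, 1)$. A minimal sketch follows; the function name is illustrative, and the sample inputs use the well-known expansions $\sqrt{2} = [1, 2, 2, 2, \dots]$ and $\pi = [3, 7, 15, 1, \dots]$, which are not taken from the text.

```python
def convergents(quotients):
    """List of (p_n, q_n), n = 1, 2, ..., for the partial quotients a_1, a_2, ..."""
    p_prev, q_prev = 1, 0                    # (p_0, q_0)
    p_curr, q_curr = quotients[0], 1         # (p_1, q_1)
    result = [(p_curr, q_curr)]
    for a_n in quotients[1:]:
        p_prev, p_curr = p_curr, a_n * p_curr + p_prev
        q_prev, q_curr = q_curr, a_n * q_curr + q_prev
        result.append((p_curr, q_curr))
    return result
```

For the first five partial quotients of $\sqrt{2}$ this gives $1/1, 3/2, 7/5, 17/12, 41/29$, and consecutive pairs satisfy $p_n q_{n-1} - p_{n-1} q_n = (-1)^n$.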


Lemma 4.6. For any $n \ge 2$,
\[
\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^n}{q_n q_{n-1}}.
\]
Furthermore, for $n \ge 3$,
\[
\frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}} = \frac{(-1)^{n-1} a_n}{q_n q_{n-2}}.
\]

Proof. (a) Write the equality of Lemma 4.5 as $p_n q_{n-1} - p_{n-1} q_n = (-1)^n$
and divide both sides by $q_n q_{n-1}$.
(b) Similar to the proof of Lemma 4.3 we obtain
\[
\begin{pmatrix} p_n & p_{n-2} \\ q_n & q_{n-2} \end{pmatrix}
= \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_2 & 1 \\ 1 & 0 \end{pmatrix} \cdots
\begin{pmatrix} a_{n-2} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_{n-1} a_n + 1 & 1 \\ a_n & 0 \end{pmatrix}.
\]
Passing to the determinant we arrive at $p_n q_{n-2} - p_{n-2} q_n = (-1)^{n-1} a_n$.
Finally, divide both sides by $q_n q_{n-2}$.

The second equality of Lemma 4.6 implies the following.

Lemma 4.7. If $a_2, a_3, \dots$ are positive (not necessarily integer!) numbers
then the sequence $p_n/q_n$ restricted to odd $n$ is strictly increasing, while
restricted to even $n$ it is strictly decreasing.

Lemma 4.8. For $n \ge 0$, we have the equalities
\[
q_{n+1} \alpha - p_{n+1} = \frac{(-1)^n}{\alpha_{n+2} q_{n+1} + q_n} \quad \text{and} \quad
q_n \alpha - p_n = \frac{(-1)^{n+1} \alpha_{n+2}}{\alpha_{n+2} q_{n+1} + q_n}.
\]

Proof. Write the equalities in Lemma 4.6, after a shift, in the form
\[
\frac{p_{n+2}}{q_{n+2}} - \frac{p_{n+1}}{q_{n+1}} = \frac{(-1)^n}{q_{n+2} q_{n+1}} = \frac{(-1)^n}{(a_{n+2} q_{n+1} + q_n) q_{n+1}}
\]
for $n \ge 0$ and
\[
\frac{p_{n+2}}{q_{n+2}} - \frac{p_n}{q_n} = \frac{(-1)^{n+1} a_{n+2}}{q_{n+2} q_n} = \frac{(-1)^{n+1} a_{n+2}}{(a_{n+2} q_{n+1} + q_n) q_n}
\]
for $n \ge 1$. Now specialise the variables $a_1, a_2, \dots, a_{n+1}, a_{n+2}$ involved to
$a_1, a_2, \dots, a_{n+1}, \alpha_{n+2}$ corresponding to
\[
\alpha = [a_1, a_2, \dots, a_n, \dots] = [a_1, a_2, \dots, a_n, a_{n+1}, \alpha_{n+2}]
\]
and use the fact $p_{n+2}/q_{n+2} = \alpha$ after the specialisation. Finally, observe
that the second formula is also true for $n = 0$ in view of $q_0 = 0$ and
$p_0 = q_1 = 1$.


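Theorem 4.3 (monotonicity and estimation of convergents). For odd $n$,
the $n$th convergents of $\alpha$ form a strictly increasing sequence converging to $\alpha$;
for even $n$, the $n$th convergents of $\alpha$ form a strictly decreasing sequence
converging to $\alpha$. Furthermore, for $n \ge 1$,
\[
\frac{1}{2 q_n q_{n+1}} \le \frac{1}{q_n (q_n + q_{n+1})} < \biggl| \alpha - \frac{p_n}{q_n} \biggr|
< \frac{1}{q_n q_{n+1}} \le \frac{1}{a_{n+1} q_n^2}.
\]

Proof. By making a reference to Lemma 4.7, we only need to explain the
estimates. Because $\alpha$ is always located between two consecutive convergents
$p_n/q_n$ and $p_{n+1}/q_{n+1}$, we deduce that
\[
\biggl| \alpha - \frac{p_n}{q_n} \biggr| < \biggl| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \biggr| = \frac{1}{q_{n+1} q_n};
\]
in addition, $q_{n+1} = a_{n+1} q_n + q_{n-1} \ge a_{n+1} q_n$ for $n \ge 1$. Similarly, by
locating $\alpha$ with respect to $p_n/q_n$ and $p_{n+2}/q_{n+2}$ we obtain
\[
\biggl| \alpha - \frac{p_n}{q_n} \biggr| > \biggl| \frac{p_{n+2}}{q_{n+2}} - \frac{p_n}{q_n} \biggr|
= \frac{a_{n+2}}{q_{n+2} q_n} = \frac{a_{n+2}}{(a_{n+2} q_{n+1} + q_n) q_n} \ge \frac{1}{(q_{n+1} + q_n) q_n},
\]
since $a_{n+2} \ge 1$.

Since $q_{n+1} \ge q_n$, we conclude that the principal convergents $p_n/q_n$
satisfy the inequality
\[
\biggl| \alpha - \frac{p_n}{q_n} \biggr| < \frac{1}{q_n^2}.
\]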

Exercise 4.3. Prove that
\[
\alpha = a_1 + \sum_{n=2}^\infty \frac{(-1)^n}{q_n q_{n-1}}.
\]

Hint. Use Lemma 4.6 to show that
\[
\frac{p_n}{q_n} = a_1 + \sum_{j=2}^n \frac{(-1)^j}{q_j q_{j-1}}
\]
and then apply the convergence of $p_n/q_n$ to $\alpha$ as $n \to \infty$.

Exercise 4.4. Define $S_0 = 2$ and $S_{n+1} = S_n^2 - S_n + 1$ for $n \ge 0$; this is
Sylvester’s sequence.
(a) Using Exercise 4.3 show that the partial quotients in the continued
fraction expansion of
\[
C = \sum_{n=0}^\infty \frac{(-1)^n}{S_n - 1}
\]
are all squares.
(b) Prove that
\[
C' = \sum_{n=0}^\infty \frac{(-1)^n}{S_n}
\]
has a continued fraction $[a_1, a_2, \dots, a_n, \dots]$ where for $n \ge 3$ each $a_n/2$
is a square.
(c) Finally, show that $2C = C' + 1$.

(The number $C = 0.64341054628\dots$ is sometimes called Cahen’s constant.)


4.4 A taste of diophantine approximation 


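Lemma 4.9. Let $p_{n-1}/q_{n-1}$ and $p_n/q_n$ be two successive convergents of
an irrational number $\alpha = [a_1, a_2, \dots]$. Then at least one of these fractions
satisfies the inequality
\[
\biggl| \alpha - \frac{p}{q} \biggr| < \frac{1}{2 q^2}.
\]

Proof. Assume, to get a contradiction, that
\[
\biggl| \alpha - \frac{p_{n-1}}{q_{n-1}} \biggr| \ge \frac{1}{2 q_{n-1}^2} \quad \text{and} \quad
\biggl| \alpha - \frac{p_n}{q_n} \biggr| \ge \frac{1}{2 q_n^2}.
\]
Using the fact that $\alpha$ lies between $p_{n-1}/q_{n-1}$ and $p_n/q_n$ (Theorem 4.3) we
obtain
\[
\frac{1}{q_{n-1} q_n} = \biggl| \frac{p_{n-1}}{q_{n-1}} - \frac{p_n}{q_n} \biggr|
= \biggl| \alpha - \frac{p_{n-1}}{q_{n-1}} \biggr| + \biggl| \alpha - \frac{p_n}{q_n} \biggr|
\ge \frac{1}{2 q_{n-1}^2} + \frac{1}{2 q_n^2}.
\]
This contradicts the inequality $xy < (x^2 + y^2)/2$ for $x > y > 0$, applied
with $x = 1/q_{n-1}$ and $y = 1/q_n$.

Our next statement shows that, in a certain sense, the converse of
Lemma 4.9 holds as well.

Theorem 4.4 (Legendre). Let $p$ and $q$ be coprime integers, $q > 0$, and let
\[
\biggl| \alpha - \frac{p}{q} \biggr| < \frac{1}{2 q^2}.
\]
Then $p/q$ is a convergent of $\alpha$.

Proof. Write the continued fraction expansion of the rational number
$p/q$: $p/q = [a_1, a_2, \dots, a_n]$. Let $p_{n-1}/q_{n-1}$ and $p/q = p_n/q_n$ be the
last two convergents of this expansion, where we assume that both $\alpha$
and $p_{n-1}/q_{n-1}$ are simultaneously greater or smaller than the number
$p/q$ (if this does not happen then we replace the continued fraction with
$p/q = [a_1, \dots, a_{n-1}, a_n - 1, 1]$). Consider the number
\[
\beta = -\frac{p_{n-1} - \alpha q_{n-1}}{p_n - \alpha q_n}, \tag{4.5}
\]
for which we have
\[
\biggl| \beta + \frac{q_{n-1}}{q_n} \biggr|
= \biggl| \frac{p_{n-1} - \alpha q_{n-1}}{\alpha q_n - p_n} + \frac{q_{n-1}}{q_n} \biggr|
= \frac{1}{q_n^2 |\alpha - p_n/q_n|} = \frac{1}{q^2 |\alpha - p/q|} > 2,
\]
implying
\[
|\beta| \ge \biggl| \beta + \frac{q_{n-1}}{q_n} \biggr| - \frac{q_{n-1}}{q_n} > 2 - \frac{q_{n-1}}{q_n} \ge 1. \tag{4.6}
\]
Comparing the latter inequality with (4.5) we deduce that
\[
|p_{n-1} - \alpha q_{n-1}| > |p_n - \alpha q_n|;
\]
hence
\[
\biggl| \alpha - \frac{p_{n-1}}{q_{n-1}} \biggr| > \biggl| \alpha - \frac{p_n}{q_n} \biggr|.
\]
However, the numbers $\alpha$ and $p_{n-1}/q_{n-1}$ are both either greater or smaller
than $p/q$, that is, $\alpha$ lies between $p_{n-1}/q_{n-1}$ and $p/q = p_n/q_n$. But then
$\beta > 0$ in accordance with (4.5), and so $\beta > 1$ by (4.6).

Let $[a_{n+1}, a_{n+2}, \dots]$ be the continued fraction of $\beta$; we have $a_{n+1} \ge 1$
in view of $\beta > 1$. Then
\[
[a_1, \dots, a_n, a_{n+1}, \dots] = [a_1, \dots, a_n, \beta] = \alpha;
\]
in other words, $p/q = p_n/q_n$ is indeed a convergent of $\alpha$.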

4.5 Equivalent numbers 


The set of matrices
\[
\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]
with integer entries $a, b, c, d$ and determinant $\pm 1$ (that is, $ad - bc = 1$ or $-1$)
is a multiplicative group with identity (neutral) element
\[
E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
Indeed, the product of any two such matrices and the inverse of such a
matrix again has integer entries and determinant equal to $\pm 1$. This group
is known as the general linear group (over the ring $\mathbb{Z}$) and is denoted by
$GL_2(\mathbb{Z})$; in what follows we reserve the notation $\Gamma$ for this group.

For an irrational number $\alpha$, the action of an element $\gamma \in \Gamma$ is defined
by the rule
\[
\gamma\alpha = \frac{a\alpha + b}{c\alpha + d}.
\]

Exercise 4.5. Show that the action is well defined, namely, that $E\alpha = \alpha$
and $\gamma(\delta\alpha) = (\gamma\delta)\alpha$ for all $\gamma, \delta \in \Gamma$.

We say that two irrational numbers $\alpha$ and $\beta$ are equivalent if $\gamma\alpha = \beta$
for some $\gamma \in \Gamma$.

Exercise 4.6. Show that this relation is indeed an equivalence.

For an irrational number $\alpha$ we have the representation
\[
\alpha = [a_1, \dots, a_n, \alpha_{n+1}] = \frac{p_n \alpha_{n+1} + p_{n-1}}{q_n \alpha_{n+1} + q_{n-1}}
\]
in accordance with Lemma 4.3. Define the $n$th continued transformation of
the number $\alpha$ by the equality
\[
\gamma_n = \begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix}
= \prod_{j=1}^n \begin{pmatrix} a_j & 1 \\ 1 & 0 \end{pmatrix};
\]
note that $\gamma_n \in \Gamma$ from computing the determinants in the latter product.
Then $\alpha = \gamma_n \alpha_{n+1}$, and hence $\alpha$ is equivalent to $\alpha_n$ for any $n \ge 1$. In other
words, all complete quotients $\alpha_n$, $n = 1, 2, \dots$, are equivalent to each other.


The following theorem characterises the situation considered in this
example.

Theorem 4.5. Let $\alpha, \beta \in \mathbb{R} \setminus \mathbb{Q}$ and
\[
\alpha = \gamma\beta = \frac{a\beta + b}{c\beta + d} \quad \text{for some } \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma.
\]
Assume that $\beta > 1$ and $c > d > 0$. Then $b/d$ and $a/c$ are two consecutive
convergents of $\alpha$, say, $p_{n-1}/q_{n-1}$ and $p_n/q_n$; furthermore, $\beta = \alpha_{n+1}$.

Proof. Note that $a$ and $c$ are relatively prime since $ad - bc = \pm 1$. Write
$a/c$ as the finite continued fraction
\[
\frac{a}{c} = [a_1, \dots, a_{n-1}, a_n] = \frac{p_n}{q_n}, \qquad a_n > 1,
\]
where $a = p_n$ and $c = q_n$. Increasing by 1, if required, the length of the
continued fraction for $a/c$ (namely, replacing $a_n$ with $a_n - 1, 1$ in its record),
we obtain the equality
\[
p_n q_{n-1} - q_n p_{n-1} = \varepsilon,
\]
where $\varepsilon = ad - bc$. Then
\[
ad - bc = p_n d - q_n b = \varepsilon
\]
and comparing the two equations we deduce that
\[
p_n (d - q_{n-1}) = q_n (b - p_{n-1}). \tag{4.7}
\]
Since $p_n$ and $q_n$ are coprime, we conclude from (4.7) that $q_n$ divides $d - q_{n-1}$;
but $q_{n-1} \le q_n$ and $0 < d < c = q_n$, that is, $|d - q_{n-1}| < q_n$, so that
$d - q_{n-1} = 0$. Then (4.7) implies that $b - p_{n-1} = 0$. Therefore,
\[
\alpha = \frac{a\beta + b}{c\beta + d} = \frac{p_n \beta + p_{n-1}}{q_n \beta + q_{n-1}} = [a_1, \dots, a_n, \beta].
\]
By the hypothesis $\beta > 1$, so the resulting expression is the continued
fraction representing the number $\alpha$ and we have $\beta = \alpha_{n+1}$. This means that
$b/d$ and $a/c$ are consecutive convergents of $\alpha$.


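Theorem 4.6 (Serret). Two numbers $\alpha, \beta \in \mathbb{R} \setminus \mathbb{Q}$ are equivalent if and
only if there exist integers $n, m \ge 1$ such that $\alpha_n = \beta_m$. In other words, $\alpha$
and $\beta$ are equivalent if and only if their continued fractions are
\[
\alpha = [a_1, a_2, \dots] \quad \text{and} \quad \beta = [b_1, b_2, \dots]
\]
and $a_n = b_{n+l}$ for some $l \in \mathbb{Z}$ and all $n \ge N$, with $N$ sufficiently large.

Proof. First assume that for some $n, m \ge 1$ we have $\alpha_n = \beta_m$, that is,
\[
\alpha = [a_1, \dots, a_{n-1}, \alpha_n], \qquad \beta = [b_1, \dots, b_{m-1}, \beta_m]
\]
and that $\alpha_n = \beta_m$. Since $\alpha$ is equivalent to $\alpha_n$ and $\beta$ is equivalent to $\beta_m$
(as we have already seen), we conclude that $\alpha$ and $\beta$ are equivalent.

Conversely, suppose that $\alpha$ and $\beta$ are equivalent, that is,
\[
\beta = \gamma\alpha = \frac{a\alpha + b}{c\alpha + d}, \qquad ad - bc = \pm 1.
\]
Changing, if necessary, the signs of all entries of $\gamma$ to their opposites, we may
assume that $c\alpha + d > 0$. Let $\gamma_{n-1}$ be the $(n-1)$th continued transformation
of $\alpha$; thus $\alpha = \gamma_{n-1} \alpha_n$. Then $\beta = \gamma \gamma_{n-1} \alpha_n$, and
\[
\gamma \gamma_{n-1} = \begin{pmatrix} a p_{n-1} + b q_{n-1} & a p_{n-2} + b q_{n-2} \\ c p_{n-1} + d q_{n-1} & c p_{n-2} + d q_{n-2} \end{pmatrix}
= \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}.
\]
We have
\[
\begin{aligned}
c p_{n-1} + d q_{n-1} &= q_{n-1} \Bigl( c\, \frac{p_{n-1}}{q_{n-1}} + d \Bigr) = c', \\
c p_{n-2} + d q_{n-2} &= q_{n-2} \Bigl( c\, \frac{p_{n-2}}{q_{n-2}} + d \Bigr) = d'.
\end{aligned} \tag{4.8}
\]
Take $n$ large enough that both $p_{n-2}/q_{n-2}$ and $p_{n-1}/q_{n-1}$ are close to $\alpha$;
the parity of $n$ is chosen depending on the sign of $c$ to have
\[
c\, \frac{p_{n-2}}{q_{n-2}} < c\alpha < c\, \frac{p_{n-1}}{q_{n-1}}.
\]
Then $c' > 0$, $d' > 0$ and, in addition, $\alpha_n > 1$; from (4.8) we have $c' > d'$
as $q_{n-2} < q_{n-1}$. Thus, all the conditions of Theorem 4.5 are fulfilled, and
we conclude that $\alpha_n = \beta_m$ for some $m$. This completes our proof of the
theorem.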


4.6 Continued fraction of a quadratic irrational 


Let $d$ be a positive integer. It can be seen that the set $\{x + y\sqrt{d} : x, y \in \mathbb{Q}\}$
forms a field. In what follows, we assume that this field does not coincide
with $\mathbb{Q}$, in other words, that $d$ is not a perfect square. Moreover, without
loss of generality we may assume that the number $d$ is square-free (that is,
$d$ is not divisible by a square $> 1$).

Note that $1$ and $\sqrt{d}$ are linearly independent over $\mathbb{Q}$ (otherwise $\sqrt{d}$
would be rational). This implies that each element of the field possesses a
unique representation in the form $x + y\sqrt{d}$ with $x, y \in \mathbb{Q}$. Let this field be
denoted by $\mathbb{Q}(\sqrt{d})$ and define the conjugate of a number $\alpha = x + y\sqrt{d}$ to
be $\bar\alpha = x - y\sqrt{d}$.

Exercise 4.7. Verify that
\[
\overline{\alpha + \beta} = \bar\alpha + \bar\beta \quad \text{and} \quad \overline{\alpha\beta} = \bar\alpha \bar\beta.
\]
Now define the trace and the norm of a number $\alpha \in \mathbb{Q}(\sqrt{d})$ by
\[
\operatorname{Tr}(\alpha) = \alpha + \bar\alpha = 2x \in \mathbb{Q}, \qquad N(\alpha) = \alpha\bar\alpha = x^2 - d y^2 \in \mathbb{Q}.
\]



Then $\alpha$ and its conjugate $\bar\alpha$ are the roots of the quadratic polynomial
\[
(z - \alpha)(z - \bar\alpha) = z^2 - \operatorname{Tr}(\alpha)\, z + N(\alpha)
\]
with rational coefficients; this characterises $\alpha$ as a quadratic irrational.
Thus, a defining equation for the quadratic irrational $\alpha$ can be written
in the form
\[
\alpha^2 - 2x\alpha + (x^2 - d y^2) = 0;
\]
taking $x^2 - d y^2 = c/a$ and $-2x = b/a$, where $a, b, c$ are coprime integers
and $a > 0$, we can represent the quadratic equation as
\[
a\alpha^2 + b\alpha + c = 0
\]
with coprime integers $a, b, c$, $a > 0$. Such $a, b, c$ are determined by $\alpha$
uniquely. Finally, define the discriminant of a quadratic irrational $\alpha$ by
the formula
\[
D(\alpha) = b^2 - 4ac = 4 a^2 y^2 d.
\]
Since $\alpha$ is a real irrational number, we have $D(\alpha) > 0$.

We shall call $\alpha$ a reduced quadratic irrational if $\alpha > 1$ and $-1 < \bar\alpha < 0$
(equivalently, $-1/\bar\alpha > 1$).

Exercise 4.8. If $\alpha$ is a reduced quadratic irrational, show that $-1/\bar\alpha$ is
reduced as well.


Theorem 4.7. For a given positive integer $D$, there exist at most finitely
many reduced elements of the field $\mathbb{Q}(\sqrt{d})$ whose discriminant is equal to $D$.

Proof. Let $\alpha$ be a reduced number having discriminant $D(\alpha) = D$. Then
\[
\alpha = \frac{-b + \varepsilon\sqrt{D}}{2a} > 1, \qquad -1 < \bar\alpha = \frac{-b - \varepsilon\sqrt{D}}{2a} < 0, \tag{4.9}
\]
where $\varepsilon = 1$ or $-1$. If $\varepsilon = -1$ then we obtain $a < 0$, which is impossible.
Therefore $\varepsilon = 1$ and, in accordance with $a > 0$ and (4.9),
\[
b + \sqrt{D} < 2a < -b + \sqrt{D}. \tag{4.10}
\]
This means that $b < 0$; furthermore, the second inequality in (4.9) implies
$-b < \sqrt{D}$. From these bounds on $|b|$ we conclude that there are finitely
many possibilities for the quantity $b$ to satisfy the inequalities (4.9). In
turn, the inequality (4.10) retains only finitely many possibilities for the
quantity $a > 0$ as well. Finally, the quantity $c \in \mathbb{Z}$ (if it exists) is subject
to the relation $b^2 - 4ac = D$, and hence it is determined uniquely by the
three quantities $D$, $a$ and $b$.
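
The proof is effectively an enumeration procedure, which can be sketched as follows. The code lists all coprime triples $(a, b, c)$ with $a > 0$ and $b^2 - 4ac = D$ satisfying $-\sqrt{D} < b < 0$ and (4.10); each triple encodes the reduced number $(-b + \sqrt{D})/(2a)$. The comparisons with $\sqrt{D}$ are done on squares so that only integer arithmetic is used. The function name and the assumption that $D$ is a positive non-square integer belong to this example.

```python
from math import gcd, isqrt

def reduced_forms(D):
    """Coprime (a, b, c), a > 0, b*b - 4*a*c == D, with (-b + sqrt(D)) / (2a) reduced.

    D is assumed to be a positive integer that is not a perfect square."""
    found = []
    root = isqrt(D)                          # floor of sqrt(D)
    for b in range(-root, 0):                # -sqrt(D) < b < 0
        if (b * b - D) % 4:
            continue                         # 4ac = b^2 - D must be divisible by 4
        n = (D - b * b) // 4                 # n = -a*c > 0
        for a in range(1, n + 1):
            if n % a:
                continue
            c = -(n // a)
            # (4.10): b + sqrt(D) < 2a < -b + sqrt(D), tested without floating point
            lower_ok = (2 * a - b) ** 2 > D
            upper_ok = 2 * a + b < 0 or (2 * a + b) ** 2 < D
            if lower_ok and upper_ok and gcd(gcd(a, abs(b)), abs(c)) == 1:
                found.append((a, b, c))
    return found
```

For $D = 5$ the only triple is $(1, -1, -1)$, the golden ratio $(1 + \sqrt{5})/2$; for $D = 12$ there are two, corresponding to $1 + \sqrt{3}$ and $(1 + \sqrt{3})/2$.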
Lemma 4.10. If $\alpha$ has discriminant $D > 0$ and $\beta$ is equivalent to $\alpha$ then
$\beta$ has the same discriminant $D$.

Proof. For $\alpha = x + y\sqrt{d}$ write
\[
\alpha = \frac{A\beta + B}{E\beta + F}, \qquad AF - BE = \pm 1.
\]
Then
\[
a \Bigl( \frac{A\beta + B}{E\beta + F} \Bigr)^2 + b\, \frac{A\beta + B}{E\beta + F} + c = 0
\]
is equivalent to the quadratic equation
\[
\begin{aligned}
& a(A\beta + B)^2 + b(A\beta + B)(E\beta + F) + c(E\beta + F)^2 \\
&\quad = (aA^2 + bAE + cE^2)\beta^2 + (2aAB + bAF + bBE + 2cEF)\beta + (aB^2 + bBF + cF^2) = 0
\end{aligned}
\]
whose discriminant is equal to
\[
(2aAB + bAF + bBE + 2cEF)^2 - 4(aA^2 + bAE + cE^2)(aB^2 + bBF + cF^2) = b^2 - 4ac = D(\alpha),
\]
and whose coefficients are coprime. (If there is a common factor of the
coefficients then the inverse transformation
\[
\beta = \frac{F\alpha - B}{-E\alpha + A}
\]
leads to the original quadratic equation for $\alpha$, with coefficients $a, b, c$ having
the same common factor.)

Theorem 4.8. Let $\alpha$ be a real quadratic irrational number. Then
(i) the number $\alpha_n$, $n \ge 1$, in the continued fraction
\[
\alpha = [a_1, \dots, a_{n-1}, \alpha_n]
\]
has the same discriminant as $\alpha$;
(ii) if $\alpha$ is a reduced number then $\alpha_n$ is reduced for any $n \ge 1$ as
well; and
(iii) if $\alpha$ is not necessarily reduced then $\alpha_n$ is reduced for all $n$
sufficiently large.




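Proof. Claim (i) follows from Lemma 4.10. Moreover, the defining
procedure of the continued fraction for $\alpha$ implies $\alpha_n > 1$ for all $n > 1$.

(ii) If $\alpha$ is reduced then $\alpha = a + 1/\beta$ for an integer $a \ge 1$ and a real
$\beta > 1$; this implies $-1/\bar\beta = a - \bar\alpha > 1$, since $a \ge 1$ and $\bar\alpha < 0$. Therefore
$\beta$ is a reduced number as well.

(iii) From
\[
\alpha = \gamma_n \alpha_{n+1} = \frac{p_n \alpha_{n+1} + p_{n-1}}{q_n \alpha_{n+1} + q_{n-1}}
\]
we have
\[
\alpha_{n+1} = -\frac{q_{n-1} \alpha - p_{n-1}}{q_n \alpha - p_n}, \tag{4.11}
\]
implying
\[
\bar\alpha_{n+1} = -\frac{q_{n-1} \bar\alpha - p_{n-1}}{q_n \bar\alpha - p_n}
= -\frac{q_{n-1}}{q_n} \cdot \frac{\bar\alpha - p_{n-1}/q_{n-1}}{\bar\alpha - p_n/q_n}. \tag{4.12}
\]
Eventually the fractions $p_{n-1}/q_{n-1}$ and $p_n/q_n$ become close to $\alpha$, so
that both the numerator and denominator of the last fraction are close
to $\bar\alpha - \alpha$; in particular, they have the same sign. Consequently, $\bar\alpha_{n+1} < 0$.
Furthermore,
\[
\frac{\bar\alpha - p_{n-1}/q_{n-1}}{\bar\alpha - p_n/q_n}
= 1 + \frac{p_n/q_n - p_{n-1}/q_{n-1}}{\bar\alpha - p_n/q_n}
= 1 + \frac{(-1)^n}{q_n q_{n-1} (\bar\alpha - p_n/q_n)}.
\]
Continuing (4.12) we find that
\[
\bar\alpha_{n+1} + 1 = \frac{1}{q_n} \biggl( q_n - q_{n-1} + \frac{(-1)^{n+1}}{q_n (\bar\alpha - p_n/q_n)} \biggr).
\]
The expression
\[
\frac{1}{q_n (\bar\alpha - p_n/q_n)}
\]
tends to 0 as $n \to \infty$, hence its absolute value is less than 1 for all $n$
sufficiently large. This implies $\bar\alpha_{n+1} + 1 > 0$ and finishes our proof of
claim (iii).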


As a side application, we may iterate (4.11) to derive the following.

Exercise 4.9 (distance formula). Show that for $n \ge 1$
\[ \alpha_2 \cdots \alpha_n\alpha_{n+1} = \frac{(-1)^n}{p_n - q_n\alpha}, \]
with our previous convention $p_0 = 1$, $q_0 = 0$.

Hint. Iterate equality (4.11).


It turns out that one may usefully think of $|\log|p_n - q_n\alpha||$ as measuring
a weighted distance that the continued fraction has traversed in moving
from $\alpha$ to $\alpha_{n+1}$.
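
A quick floating-point experiment illustrates the size statement behind the distance formula. This is only a sketch under our own assumptions: the function name is hypothetical, the auxiliary starting values p_{-1} = 0, q_{-1} = 1 are introduced to run the usual recurrence from p_0 = 1, q_0 = 0, and only absolute values are compared, so that rounding and sign conventions play no role.

```python
import math

def distance_check(alpha, n_max):
    """Return pairs (|alpha_2 ... alpha_{n+1}|, 1/|p_n - q_n*alpha|) for n = 1..n_max."""
    p_prev, q_prev = 0, 1   # auxiliary values p_{-1}, q_{-1} (assumption of this sketch)
    p, q = 1, 0             # the convention p_0 = 1, q_0 = 0
    x = alpha               # x runs through the complete quotients alpha_1, alpha_2, ...
    prod = 1.0
    rows = []
    for _ in range(n_max):
        a = math.floor(x)                 # partial quotient a_n
        p, p_prev = a * p + p_prev, p     # p_n = a_n p_{n-1} + p_{n-2}
        q, q_prev = a * q + q_prev, q     # q_n = a_n q_{n-1} + q_{n-2}
        x = 1.0 / (x - a)                 # next complete quotient alpha_{n+1}
        prod *= x
        rows.append((abs(prod), 1.0 / abs(p - q * alpha)))
    return rows

for lhs, rhs in distance_check(math.sqrt(7), 8):
    print(f"{lhs:.9f}  {rhs:.9f}")
```

Floating-point errors grow with n, so the comparison is meaningful only for the first few convergents.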


4.7 Euler-Lagrange theorem 


Let $\alpha$ be a real irrational number. We say that its continued fraction
\[ [a_1, a_2, a_3, \dots] \]
is periodic if there exists a positive integer $k$ such that $a_{n+k} = a_n$ for all $n$
sufficiently large and purely periodic if $a_{n+k} = a_n$ for all $n \ge 1$; we call $k$
the primitive period if it is the smallest positive integer with the above
property.

The following standard notation is used for periodic continued fractions:
\[ [a_1, \dots, a_r, \overline{a_{r+1}, \dots, a_{r+k}}\,], \]
where the vinculum (overbar) denotes the periodic repetition of the
corresponding part. A continued fraction is purely periodic iff it can be written
in the form $[\,\overline{a_1, \dots, a_k}\,]$.


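Lemma 4.11. Let $\alpha$ be a reduced quadratic irrational and $a$ an integer.
Write $\alpha = a + 1/\beta$. Then $\beta$ is reduced iff $a < \alpha < a + 1$, that is, iff
$a = \lfloor\alpha\rfloor$.


Proof. If $a = \lfloor\alpha\rfloor$ then $\beta > 1$ and $-1/\overline{\beta} = a - \overline{\alpha} > a = \lfloor\alpha\rfloor \ge 1$, hence $\beta$ is
reduced.

Conversely, if $\alpha < a$ then $\beta < 0$, and if $a + 1 < \alpha$ then $\beta < 1$; thus
$\beta$ cannot be reduced if $a \ne \lfloor\alpha\rfloor$.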


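We point out that the relation between $\alpha$ and $\beta$ in Lemma 4.11
determines one of these numbers in terms of the other. Indeed,
\[ -\frac{1}{\overline{\beta}} = a - \overline{\alpha}, \qquad 0 < -\overline{\alpha} < 1, \]
which implies that $a = \lfloor -1/\overline{\beta}\rfloor$. Moreover, $\overline{\alpha}$ (and hence $\alpha$ itself) is uniquely
determined by $\overline{\beta}$ or, hence, by $\beta$.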

We now come to a central result characterising quadratic irrationals in 
terms of the periodicity of their continued fractions. Recall, in contrast, 
that the eventual periodicity of its base-b expansion characterises the ra- 
tionality of the number. This points to the power of continued fraction 
representations over b-ary ones. 
Theorem 4.9 (Euler–Lagrange theorem). Let $\alpha$ be a real irrational number.
The continued fraction for $\alpha$ is periodic iff $\alpha$ is a quadratic irrational.
In the latter case, $\alpha$ is reduced iff its continued fraction is purely periodic.


Proof. First assume that $\alpha$ is a quadratic irrational. By Theorem 4.8(iii)
the corresponding tails $\alpha_n$ are reduced for all $n \ge n_0$, while Theorem 4.7,
together with Lemma 4.10, implies the finiteness of the reduced numbers
that are equivalent to $\alpha$. Therefore, for some $n \ge n_0$ and $k \ge 1$ we have
$\alpha_n = \alpha_{n+k}$. This immediately implies the periodicity of the continued
fraction. Furthermore, assume that $\alpha$ itself is reduced; by part (ii) of
Theorem 4.8 all the $\alpha_n$ are reduced as well. As we already know, $\alpha_n = \alpha_{n+k}$
for some $n$ and $k \ge 1$. From Lemma 4.11 and the comment to it, we
conclude that $\alpha_{n-1}$ is uniquely determined by $\alpha_n$ and hence that
$\alpha_{n-1} = \alpha_{n+k-1}$. Applying this descent $n - 1$ times, we finally arrive at
$\alpha = \alpha_1 = \alpha_{k+1}$; in other words, the continued fraction is purely periodic.

Conversely, if a continued fraction is purely periodic then it may be
written as
\[ \alpha = [\,\overline{a_1, \dots, a_k}\,] = [a_1, \dots, a_k, \alpha]. \]
The relation $\alpha = \gamma_k\alpha$ implies that $\alpha$ is a root of a quadratic equation with
integer coefficients, while by claim (iii) of Theorem 4.8 the number $\gamma_k\alpha = \alpha$
is reduced (indeed, $\alpha = \alpha_{nk+1}$ for every $n \ge 1$). In the case of a periodic
continued fraction, we write
\[ \alpha = [a_1, \dots, a_r, \overline{a_{r+1}, \dots, a_{r+k}}\,] = [a_1, \dots, a_r, \alpha_{r+1}], \]
where the purely periodic continued fraction $\alpha_{r+1} = [\,\overline{a_{r+1}, \dots, a_{r+k}}\,]$ is,
by the above argument, a (reduced) quadratic irrational. Since $\alpha$ and $\alpha_{r+1}$
are equivalent, the number $\alpha$ is a quadratic irrational as well.
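
To see the periodicity in practice, one can expand square roots with exact integer arithmetic. The sketch below uses the classical recurrence for the complete quotients $(m + \sqrt{d})/q$ of $\sqrt{d}$; this recurrence is a standard device and is not taken from the proof above, and the function name is our own.

```python
from math import isqrt

def sqrt_continued_fraction(d):
    """Return (a1, period) with sqrt(d) = [a1, overline(period)] for a non-square d > 1."""
    a1 = isqrt(d)
    if a1 * a1 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a1           # current complete quotient is (m + sqrt(d)) / q
    period = []
    while True:
        m = q * a - m
        q = (d - m * m) // q     # this division is exact
        a = (a1 + m) // q        # floor of (m + sqrt(d)) / q
        period.append(a)
        if q == 1:               # the complete quotient a1 + sqrt(d) has reappeared
            return a1, period

print(sqrt_continued_fraction(7))
print(sqrt_continued_fraction(13))
```

In every case the period ends with twice the initial partial quotient and is palindromic before that last entry, in agreement with Perron's theorem (Exercise 4.11).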


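Exercise 4.10. Show that if $\alpha$ is reduced and $\alpha = [\,\overline{a_1, \dots, a_k}\,]$ then
$-1/\overline{\alpha} = [\,\overline{a_k, a_{k-1}, \dots, a_1}\,]$.

Exercise 4.11 (Perron's theorem). Let $\beta$ be a real number. Show that $\beta$ is
the square root of a rational number $> 1$ iff there exist an integer $b_1 > 0$
and a finite (possibly empty) palindromic list of positive integers $b_2, \dots, b_k$
such that $\beta = [b_1, \overline{b_2, \dots, b_k, 2b_1}\,]$.


Sketch of solution. An equivalent way of saying that a list $b_2, b_3, \dots, b_k$ is
palindromic is that the matrix
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}
 = \begin{pmatrix} b_2 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} b_3 & 1 \\ 1 & 0 \end{pmatrix}\cdots\begin{pmatrix} b_k & 1 \\ 1 & 0 \end{pmatrix} \]
is symmetric (that is, $b = c$). Writing
\begin{align*}
\beta &= [b_1, \overline{b_2, \dots, b_k, 2b_1}\,] = [b_1, b_2, \dots, b_k, b_1 + \beta] \\
&= b_1 + \frac{1}{[b_2, \dots, b_k, b_1 + \beta]} = b_1 + \frac{c(b_1 + \beta) + d}{a(b_1 + \beta) + b}
\end{align*}
we obtain a quadratic equation for $\beta$,
\[ a\beta^2 + (b - c)\beta - b_1(ab_1 + b + c) - d = 0, \]
whose linear term vanishes iff $b = c$.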


Chapter notes 


This chapter is reasonably elementary and also independent of the remain- 
ing contents; at the same time it provides us with a natural bridge between 
integer investigations (via an extension of the Euclidean algorithm) and 
diophantine questions (representation as sums of squares and rational ap- 
proximations of real numbers). Continued fractions are a self-standing topic 
in number theory, with many books dedicated to them; one recommenda- 
tion would be [14] which shares the style with this book. 

The Fermat–Gauss theorem about representability of primes of the
form $4n + 1$ as sums of two squares is a record holder among the results
with multiple different proofs available; some of them can be found in [1,
Chapter 4]. One more proof can be cheaply extracted from the equality of
power series
\[ \Bigl(\sum_{m=-\infty}^{\infty} q^{m^2}\Bigr)^2 = 1 + 4\sum_{n=0}^{\infty} \frac{(-1)^n q^{2n+1}}{1 - q^{2n+1}} \]
highlighted at the end of Chapter 1; both sides represent a q-deformation
of $\pi$.


Chapter 5 


Dirichlet’s theorem on primes 
in arithmetic progressions 


5.1 Quadratic residues 


For a positive integer $m$, we say that two integers $a$ and $b$ are congruent
modulo $m$ and write $a \equiv b \pmod m$ if their difference is divisible by $m$.
All integers that are congruent to a particular number $a$ modulo $m$ form
the residue class $a \pmod m$. Here are two basic properties of congruences
you can think about.


Lemma 5.1. Assume that $a \equiv b \pmod m$ and $c \equiv d \pmod m$. Then
$a \pm c \equiv b \pm d \pmod m$ and $ac \equiv bd \pmod m$.


Lemma 5.2. Assume that $ac \equiv bc \pmod m$ and $(c, m) = 1$. Then
$a \equiv b \pmod m$.


Dirichlet's theorem, whose proof we discuss in this chapter, asserts that
for a fixed pair $l, m$ of two positive relatively prime integers, there are
infinitely many primes $p \equiv l \pmod m$.


Lemma 5.3. Let $m > 1$ and $a, b$ be integers such that $(a, m) = 1$. Then
all solutions $x$ of the congruence equation $ax \equiv b \pmod m$ form a single
residue class modulo $m$.


Proof. If $x_0$ is a solution of the congruence $ax \equiv b \pmod m$ and
$x_1 \equiv x_0 \pmod m$, then clearly $ax_1 \equiv b \pmod m$, so that $x_1$ is a solution as well.
In the other direction, if $x_0$ and $x_1$ are two solutions of the congruence then,
by subtracting, we get $a(x_1 - x_0) \equiv 0 \pmod m$, so that $x_1 \equiv x_0 \pmod m$.


Euler's totient function $\varphi(m)$ assigns to each $m \ge 1$ the number of
integers in the range $\{0, 1, \dots, m - 1\}$ (or their related residue classes
modulo $m$), which are coprime with $m$. For example, $\varphi(1) = \varphi(2) = 1$ and
\[ \varphi(p^k) = p^k - p^{k-1} \quad \text{for a prime $p$ and $k \ge 1$.} \]

Exercise 5.1. Show that $\varphi(m)$ is a multiplicative function and deduce the
formula
\[ \varphi(m) = m\prod_{p \mid m}\Bigl(1 - \frac{1}{p}\Bigr) \]
from this.


Theorem 5.1 (Euler's theorem). Let $m$ be a positive integer and $a$
relatively prime to $m$. Then $a^{\varphi(m)} \equiv 1 \pmod m$.


Proof. Consider the collection $\{r_1, \dots, r_n\} \subset \{0, 1, \dots, m - 1\}$ of $n = \varphi(m)$
integers coprime to $m$. Then the collection $\{ar_1, \dots, ar_n\}$ represents
different residue classes modulo $m$ such that $(ar_j, m) = 1$ for all $j$. This means
that $\{ar_1, \dots, ar_n\}$ is a permutation of $\{r_1, \dots, r_n\}$ modulo $m$; in
particular, the products $\prod_{j=1}^n (ar_j)$ and $\prod_{j=1}^n r_j$ are congruent modulo $m$. By
cancelling the latter product in $\prod_{j=1}^n (ar_j) \equiv \prod_{j=1}^n r_j \pmod m$ (with the
help of Lemma 5.3) we arrive at the desired claim.


Taking m = p in the theorem we get Fermat’s little theorem. 
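
The permutation argument in the proof is easy to test numerically. The following is a minimal sketch with our own function names; the totient is computed by direct counting, exactly as in its definition, which is slow but adequate for small moduli.

```python
from math import gcd

def reduced_residues(m):
    """Integers in {0, 1, ..., m-1} that are coprime to m."""
    return [r for r in range(m) if gcd(r, m) == 1]

def totient(m):
    return len(reduced_residues(m))

def is_permutation(a, m):
    """Multiplication by a permutes the reduced residues modulo m when (a, m) = 1."""
    rs = reduced_residues(m)
    return sorted(a * r % m for r in rs) == rs

def euler_holds(m):
    """Check a^phi(m) = 1 (mod m) for every residue a coprime to m."""
    phi = totient(m)
    return all(pow(a, phi, m) == 1 % m for a in reduced_residues(m))

print(totient(12), is_permutation(5, 12), euler_holds(12))
```

The comparison with `1 % m` rather than `1` only matters for the degenerate modulus m = 1.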


Exercise 5.2. Compute the product $\prod_{j=1}^n r_j \pmod m$ in the proof of
Theorem 5.1.


Exercise 5.3. Prove Wilson's theorem: A number $m > 1$ is prime if and
only if $(m - 1)! \equiv -1 \pmod m$.


5.2 Infinitude of primes of the form 4n +1 


The next two results are particular cases of Dirichlet’s theorem whose proofs 
can be accomplished by elementary consideration. 


Theorem 5.2. There are infinitely many primes of the form 4n — 1. 


Proof. Assume on the contrary that there are finitely many of them,
$p_1, \dots, p_r$ say. Then at least one of the prime factors of $N = 4p_1 \cdots p_r - 1$
must be of this form (otherwise, the number $N$ would be congruent to 1
modulo 4). On the other hand, that prime is not on our finite list because
$(N, p_j) = 1$ for all $j$; contradiction.


Theorem 5.3. There are infinitely many primes of the form 4n +1. 




Proof. First notice that the congruence $x^2 \equiv -1 \pmod p$ can only be solved
in integers $x$ for (odd!) primes $p$ of the form $4n + 1$ (and we construct those
solutions $x$ in Theorem 4.2). Indeed, if $p \equiv -1 \pmod 4$, so that $p = 4n - 1$
for some $n$, then $x^{\varphi(p)} = x^{p-1} = (x^2)^{2n-1} \equiv (-1)^{2n-1} = -1 \pmod p$ in
contradiction with Theorem 5.1.

The remaining part of the argument is similar to that in our proof of
Theorem 5.2. Assume on the contrary that there are finitely many primes
$p_1, \dots, p_r$ of the form $4n + 1$. Then the least prime divisor of the odd
number $N = (2p_1 \cdots p_r)^2 + 1$ has the same form and is not on the list;
contradiction.
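
The argument is constructive: from any finite list of primes of the form $4n + 1$ it produces a new one. A small sketch, assuming trial division is fast enough (which limits us to a few steps); the function names are ours.

```python
def least_prime_factor(n):
    """Smallest prime divisor of n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def next_prime_1_mod_4(primes):
    """Least prime divisor of N = (2 * p_1 * ... * p_r)^2 + 1."""
    prod = 2
    for p in primes:
        prod *= p
    return least_prime_factor(prod * prod + 1)

primes = []
for _ in range(3):                 # more steps quickly become too slow for trial division
    primes.append(next_prime_1_mod_4(primes))
print(primes)
```

Starting from the empty list, the first number tested is $2^2 + 1 = 5$; each later step uses all primes found so far.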


Exercise 5.4. Let $p$ be prime, $p > 3$, and suppose the congruence
\[ x^2 + x + 1 \equiv 0 \pmod p \]
has a solution $x \in \mathbb{Z}$. Prove that $p$ has the form $6n + 1$. Deduce from this
result that there are infinitely many primes of this form.

Exercise 5.5. Let $p$ be prime, $p > 5$, and suppose the congruence
\[ x^4 + x^3 + x^2 + x + 1 \equiv 0 \pmod p \]
has a solution $x \in \mathbb{Z}$. Prove that $p$ has the form $10n + 1$. Deduce from this
result that there are infinitely many primes of this form.

Exercise 5.6. Let $p$ be prime, $p > 2$, and suppose the congruence
\[ x^4 + 1 \equiv 0 \pmod p \]
has a solution $x \in \mathbb{Z}$. Prove that $p$ has the form $8n + 1$. Deduce from this
result that there are infinitely many primes of this form.

Exercise 5.7. Let $p$ be prime, $p > 2$, and suppose the congruence
\[ x^2 + 2 \equiv 0 \pmod p \]
has a solution $x \in \mathbb{Z}$. Prove that $p$ is either of the form $8n + 1$ or $8n + 3$.
Deduce from this result that there are infinitely many primes of the form $8n + 3$.


Hint. The results of this type are related to calculation of the Legendre
symbol
\[ \Bigl(\frac{a}{p}\Bigr) = \begin{cases}
0 & \text{if } a \equiv 0 \pmod p, \\
1 & \text{if } x^2 \equiv a \pmod p \text{ is soluble in } x \in \mathbb{Z}, \\
-1 & \text{if } x^2 \equiv a \pmod p \text{ is not soluble in } x \in \mathbb{Z},
\end{cases} \]
where $p$ is an odd prime and $a \in \mathbb{Z}$ is arbitrary. The statement in
Exercise 5.7 is equivalent to the claim that $\bigl(\frac{-2}{p}\bigr) = 1$ if and only if $p \equiv 1$ or
$3 \pmod 8$. To establish the latter, show that $(-2)^{(p-1)/2} \equiv 1 \pmod p$ for
such primes only and use the hint to Exercise 5.10 below.
5.3 Dirichlet characters


Fix an integer $m \ge 2$.
A function $\chi\colon \mathbb{Z} \to \mathbb{C}$ is said to be a Dirichlet character modulo $m$ if
the following three conditions are satisfied:


(1) $\chi(n) \ne 0$ if and only if $(n, m) = 1$,

(2) $\chi$ is periodic, with period $m$, that is, $\chi(n + m) = \chi(n)$ for all $n \in \mathbb{Z}$,
and

(3) $\chi$ is completely multiplicative, that is, $\chi(ab) = \chi(a)\chi(b)$ for all $a, b \in \mathbb{Z}$.


The character
\[ \chi_0(n) = \begin{cases} 1 & \text{if } (n, m) = 1, \\ 0 & \text{if } (n, m) > 1, \end{cases} \]
will play a special role in our exposition below; it is called the principal
character modulo $m$. Note that $\chi(1) = 1$; this is included in the
definition of multiplicative function but is also derivable from $\chi(1^2) = \chi(1)^2$
(condition (3)) and $\chi(1) \ne 0$ (condition (1)).

In order to describe the set of characters modulo $m$, we will pass to the
known algebraic description of the structure of the group $(\mathbb{Z}/m\mathbb{Z})^*$
consisting of residue classes relatively prime with $m$; in particular,
$|(\mathbb{Z}/m\mathbb{Z})^*| = \varphi(m)$. As the group in consideration is commutative (or abelian), the
following general result can be applied.


Theorem 5.4. Every finite abelian group $G$ can be given as a direct product
of some of its cyclic subgroups. In other words, there are cyclic subgroups
$\langle h_j\rangle_{c_j} = \{h_j^k : k = 0, 1, \dots, c_j - 1\} \subset G$ of order $c_j$, where $j = 1, \dots, r$,
such that
\[ G = \langle h_1\rangle_{c_1} \times \cdots \times \langle h_r\rangle_{c_r} = \{h_1^{k_1} \cdots h_r^{k_r} : 0 \le k_j < c_j \text{ for } j = 1, \dots, r\}. \]
Then also $|G| = c_1 \cdots c_r$.

Of course, in the case of the group $(\mathbb{Z}/m\mathbb{Z})^*$ there is an explicit
description of the direct product decomposition; we will not use it. First, it follows
from the Chinese remainder theorem that
\[ (\mathbb{Z}/m\mathbb{Z})^* \cong (\mathbb{Z}/p_1^{a_1}\mathbb{Z})^* \times \cdots \times (\mathbb{Z}/p_r^{a_r}\mathbb{Z})^*, \]
where $p_1^{a_1} \cdots p_r^{a_r}$ is the canonical prime factorisation of $m$. (This is, in fact,
a hint to solving Exercise 5.1.) Second, the following exercises show that
subgroups $(\mathbb{Z}/p^a\mathbb{Z})^*$ are cyclic for odd primes $p$, while $(\mathbb{Z}/2^a\mathbb{Z})^*$ is cyclic
for $a = 1, 2$ and is a direct product of two cyclic subgroups for $a \ge 3$.
Exercise 5.8. Let $p$ be an odd prime.

(a) Show that the group $(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic, $(\mathbb{Z}/p\mathbb{Z})^* = \langle c\rangle_{p-1}$ for some
$c \in \{2, \dots, p - 1\}$.

(b) Show that either $c$ or $c + p$ generates the whole group $(\mathbb{Z}/p^a\mathbb{Z})^*$,
independent of $a \ge 2$.


Hint. (a) You can use, for example, the fact that $\mathbb{Z}/p\mathbb{Z}$ is a field.


Exercise 5.9. (a) Verify that $(\mathbb{Z}/2\mathbb{Z})^* = \langle 1\rangle_1$ and $(\mathbb{Z}/4\mathbb{Z})^* = \langle -1\rangle_2$.
(b) Show that, for $a \ge 3$, the group $(\mathbb{Z}/2^a\mathbb{Z})^*$ is the direct product of
cyclic subgroups $\langle -1\rangle_2$ and $\langle 5\rangle_{2^{a-2}}$.


Hint. (b) Verify first that 5 has order $c = 2^{a-2}$ in $(\mathbb{Z}/2^a\mathbb{Z})^*$ and use the
fact that $\{5^k : k = 0, 1, \dots, c - 1\}$ and $\{-5^k : k = 0, 1, \dots, c - 1\}$ are
disjoint, as they cover different residue classes modulo 4.


From now on, assume that the group $(\mathbb{Z}/m\mathbb{Z})^*$ is decomposed into a
direct product of some of its cyclic subgroups in accordance with
Theorem 5.4,
\[ (\mathbb{Z}/m\mathbb{Z})^* = \{h_1^{k_1} \cdots h_r^{k_r} \pmod m : 0 \le k_j < c_j \text{ for } j = 1, \dots, r\}, \]
where $h_j \pmod m$ are generators of the cyclic subgroups of order $c_j$, where
$j = 1, \dots, r$, and $c_1 \cdots c_r = \varphi(m)$. If $\chi$ is a character modulo $m$ then
$\chi(h_j)^{c_j} = \chi(h_j^{c_j}) = \chi(1) = 1$, hence $\chi(h_j)$ is a root of unity of degree $c_j$,
for each $j = 1, \dots, r$. Suppose we have a collection $\xi_1, \dots, \xi_r$ of roots of
unity of respective degrees $c_1, \dots, c_r$. Define the function
\[ \chi(a) = \begin{cases} 0 & \text{if } (a, m) \ne 1, \\ \xi_1^{k_1} \cdots \xi_r^{k_r} & \text{if } a \equiv h_1^{k_1} \cdots h_r^{k_r} \pmod m. \end{cases} \]
It is not difficult to observe that the properties (1)–(3) in the definition of
Dirichlet character are satisfied, thus, $\chi$ is a Dirichlet character modulo $m$.
Furthermore, different collections $\xi_1, \dots, \xi_r$ and $\xi_1', \dots, \xi_r'$ induce different
Dirichlet characters $\chi$ and $\chi'$ modulo $m$ (indeed, since $\xi_j \ne \xi_j'$ for some $j$,
we have $\chi(h_j) \ne \chi'(h_j)$). The correspondence defines a bijection between
the group $(\mathbb{Z}/m\mathbb{Z})^*$ and the set of Dirichlet characters modulo $m$.


Theorem 5.5. The bijection above is an isomorphism of the groups
$(\mathbb{Z}/m\mathbb{Z})^*$ and $\{\chi \text{ a character modulo } m\}$, where the (commutative)
operation in the latter is defined by $(\chi_1\chi_2)(n) = \chi_1(n)\chi_2(n)$. In particular, the
total number of characters modulo $m$ is equal to $\varphi(m)$.

We will heavily use the correspondence between $(\mathbb{Z}/m\mathbb{Z})^*$ and the group
of Dirichlet characters modulo $m$, through collections of roots of unity
$\xi_1, \dots, \xi_r$.

Exercise 5.10. For $m = p$ an odd prime, show that the Legendre symbol
\[ \chi(a) = \Bigl(\frac{a}{p}\Bigr) = \begin{cases}
0 & \text{if } a \equiv 0 \pmod p, \\
1 & \text{if } x^2 \equiv a \pmod p \text{ is soluble in } x \in \mathbb{Z}, \\
-1 & \text{if } x^2 \equiv a \pmod p \text{ is not soluble in } x \in \mathbb{Z},
\end{cases} \]
is a Dirichlet character modulo $p$.


Hint. Use Euler's theorem (Theorem 5.1) to show that
\[ a^{(p-1)/2} \equiv \Bigl(\frac{a}{p}\Bigr) \pmod p \]
for any $a \in (\mathbb{Z}/p\mathbb{Z})^*$.
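
The congruence in the hint (Euler's criterion) gives a fast way to evaluate the Legendre symbol. The sketch below compares it with the definition by brute force; both function names are our own.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via a^((p-1)/2) mod p."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r      # r is 0, 1 or p - 1

def legendre_bruteforce(a, p):
    """Legendre symbol straight from the definition (solubility of x^2 = a mod p)."""
    a %= p
    if a == 0:
        return 0
    return 1 if any(x * x % p == a for x in range(1, p)) else -1

for p in (3, 5, 7, 11, 13, 17, 19):
    print(p, p % 8, legendre(-2, p))
```

The printed table displays the pattern from the hint to Exercise 5.7: the symbol of $-2$ equals 1 exactly for $p \equiv 1$ or $3 \pmod 8$.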


5.4 Properties of Dirichlet characters 


Lemma 5.4. Let $\xi \ne 1$ be a root of unity of degree $c \ge 2$. Then
\[ \sum_{k=0}^{c-1} \xi^k = 0. \]


Proof. Denote $S = \sum_{k=0}^{c-1} \xi^k$. Then $\xi S = \sum_{k=0}^{c-1} \xi^{k+1} = S$ implying $S = 0$.


Lemma 5.5. For $0 < k < c$, we have
\[ \sum_{\xi} \xi^k = 0, \]
where the sum is over all roots of unity of degree $c$.


Proof. Denote $S = \sum_{\xi} \xi^k$ and take a primitive root of unity $\eta$ of degree $c$,
for example, $\eta = e^{2\pi i/c}$. Then $\eta^k \ne 1$ and it follows from $\eta^kS = \sum_{\xi} (\eta\xi)^k = S$
that $S = 0$.


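Theorem 5.6. For $m \ge 2$, the value $\chi(n)$ of a character modulo $m$ at an
integer $n$ coprime to $m$ is a root of unity of degree $\varphi(m)$.
Furthermore,
\[ \sum_{n=1}^{m} \chi(n) = \begin{cases} \varphi(m) & \text{if } \chi = \chi_0 \text{ is the principal character}, \\ 0 & \text{otherwise}, \end{cases} \]
and
\[ \sum_{\chi} \chi(n) = \begin{cases} \varphi(m) & \text{if } n \equiv 1 \pmod m, \\ 0 & \text{otherwise}, \end{cases} \]
where the latter summation is over all characters modulo $m$.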


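Proof. The first part follows from the isomorphism in Theorem 5.5. Indeed,
we have $\chi(n) = \xi_1^{k_1} \cdots \xi_r^{k_r}$ for $n$ relatively prime to $m$, where $\xi_1, \dots, \xi_r$ are
roots of unity of respective degrees $c_1, \dots, c_r$. But then $\chi(n)^{c_1 \cdots c_r} = 1$
where $c_1 \cdots c_r = \varphi(m)$.

Notice that
\[ \sum_{n=1}^{m} \chi(n) = \sum_{\substack{n=1 \\ (n,m)=1}}^{m} \chi(n) \]
and the latter sum consists of $\varphi(m)$ terms equal to 1 if $\chi$ is the principal
character. If $\chi$ is not, then the corresponding collection $\xi_1, \dots, \xi_r$ contains
at least one root of unity different from 1, so that the sum
\[ \sum_{n=1}^{m} \chi(n) = \sum_{k_1=0}^{c_1-1} \xi_1^{k_1} \times \sum_{k_2=0}^{c_2-1} \xi_2^{k_2} \times \cdots \times \sum_{k_r=0}^{c_r-1} \xi_r^{k_r} \]
vanishes as at least one of its factors does (by Lemma 5.4).
Similarly, in the case of the sum over characters, the condition
$n \not\equiv 1 \pmod m$, $(n, m) = 1$, means that for at least one character we get
$\chi(n) = \xi_1^{k_1} \cdots \xi_r^{k_r} \ne 1$, so that at least one exponent $k_j$ is strictly between 0 and
$c_j$. Then
\[ \sum_{\chi} \chi(n) = \sum_{\xi_1, \xi_2, \dots, \xi_r} \xi_1^{k_1}\xi_2^{k_2} \cdots \xi_r^{k_r}
 = \sum_{\xi_1} \xi_1^{k_1} \times \sum_{\xi_2} \xi_2^{k_2} \times \cdots \times \sum_{\xi_r} \xi_r^{k_r} = 0, \]
since the factor $\sum_{\xi_j} \xi_j^{k_j}$ vanishes by Lemma 5.5.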
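
Both orthogonality relations can be observed numerically. To keep the sketch short we assume a prime modulus $p$, so that $(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic and a single generator suffices; this restriction, the brute-force search for a generator and the function names are our own simplifications, not part of the general construction above.

```python
import cmath

def primitive_root(p):
    """A generator of (Z/pZ)^* for a prime p, found by brute force."""
    for g in range(1, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("p must be prime")

def characters(p):
    """All Dirichlet characters modulo a prime p, each as a list of values chi[0..p-1]."""
    g = primitive_root(p)
    index = {pow(g, k, p): k for k in range(p - 1)}      # n = g^k  ->  k
    chars = []
    for t in range(p - 1):
        xi = cmath.exp(2j * cmath.pi * t / (p - 1))       # a root of unity of degree p - 1
        chars.append([0] + [xi ** index[n] for n in range(1, p)])
    return chars

chars = characters(7)
print([round(abs(sum(chi)), 9) for chi in chars])                    # sums over n
print([round(abs(sum(chi[n] for chi in chars)), 9) for n in range(1, 7)])  # sums over chi
```

The character with index 0 is the principal one; all values are floating-point approximations of roots of unity.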


Lemma 5.6. For real $x \ge 1$ and a non-principal character $\chi$ modulo $m$,
the sum
\[ S(x) = \sum_{n \le x} \chi(n) \]
is bounded: $|S(x)| < m$.

Proof. As the character $\chi$ is a periodic function of period $m$ and the sum
$\sum_{n=1}^{m} \chi(n)$ vanishes on the full period, we only need to check the inequality
for $1 \le x < m$. In this case,
\[ |S(x)| \le \sum_{1 \le n \le x} |\chi(n)| \le \sum_{1 \le n < m} 1 < m. \]


Lemma 5.7. To an integer $a$, $(a, m) = 1$, assign the least positive
exponent $f$ for which $a^f \equiv 1 \pmod m$. Then the set $\{\chi(a) : \chi$ is a character
modulo $m\}$, in which numbers appear multiple times, is the set of roots of
unity of degree $f$ such that each root of unity appears exactly $\varphi(m)/f$ times.


Proof. The fact that all entries in $\{\chi(a) : \chi\}$ are roots of unity of degree $f$
is straightforward: $\chi(a)^f = \chi(a^f) = \chi(1) = 1$. Now take a root of unity $\xi$
of degree $f$. From the last formula in Theorem 5.6,
\begin{align*}
S &= \sum_{\chi} \bigl(\xi^{-1}\chi(a) + \xi^{-2}\chi(a^2) + \cdots + \xi^{-f}\chi(a^f)\bigr) \\
&= \sum_{k=1}^{f} \xi^{-k} \sum_{\chi} \chi(a^k) = \xi^{-f} \sum_{\chi} \chi(a^f) = \varphi(m).
\end{align*}
On the other hand, the same sum can be computed using Lemma 5.4 and
the fact that $\xi^{-1}\chi(a)$ is a root of unity of degree $f$ (as both $\xi$ and $\chi(a)$
are). We have
\[ \sum_{k=1}^{f} \xi^{-k}\chi(a^k) = \sum_{k=1}^{f} (\xi^{-1}\chi(a))^k = \begin{cases} 0 & \text{if } \xi \ne \chi(a), \\ f & \text{if } \xi = \chi(a). \end{cases} \]
Therefore, if $g$ is the number of characters $\chi$ for which $\chi(a) = \xi$, then
$S = gf$. Combining this with the computation of $S$ above we deduce that
$g = \varphi(m)/f$.


5.5 Dirichlet L-functions and their basic properties 


For $\chi$ a character modulo $m \ge 2$, the Dirichlet L-function is defined by the
series
\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \]
As in Chapter 2 we will write $s = \sigma + it$.
Lemma 5.8. The series defining the Dirichlet L-function converges in the
half-plane $\operatorname{Re} s > 0$ for non-principal characters; the convergence is in the
half-plane $\operatorname{Re} s > 1$ for the principal character. The function $L(s, \chi)$ is
analytic in the corresponding domain, and its successive derivatives can be
computed by term-wise differentiation of the series.


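Proof. If $\chi = \chi_0$ is the principal character and $\delta > 0$ is arbitrary, then
\[ \Bigl|\frac{\chi_0(n)}{n^s}\Bigr| \le \frac{1}{n^{\sigma}} \le \frac{1}{n^{1+\delta}} \]
for $\operatorname{Re} s \ge 1 + \delta$ implying that the sums
\[ \sum_{n=1}^{N} \Bigl|\frac{\chi_0(n)}{n^s}\Bigr| \le \sum_{n=1}^{\infty} \frac{1}{n^{1+\delta}} = \text{const} \]
are uniformly bounded in the domain. Thus, the series for $L(s, \chi_0)$
converges uniformly there and defines an analytic function by the Weierstrass
theorem. Since $\delta > 0$ is arbitrary, this function $L(s, \chi_0)$ is analytic in the
domain $\operatorname{Re} s > 1$.

Assume now that the character $\chi$ is non-principal and define the sum
$S(x) = \sum_{n \le x} \chi(n)$, so that $\chi(n) = S(n) - S(n - 1)$ and
\begin{align*}
\sum_{n=1}^{N} \frac{\chi(n)}{n^s} &= \sum_{n=1}^{N} \frac{S(n) - S(n-1)}{n^s} = \sum_{n=1}^{N} \frac{S(n)}{n^s} - \sum_{n=1}^{N-1} \frac{S(n)}{(n+1)^s} \\
&= \sum_{n=1}^{N} S(n)\Bigl(\frac{1}{n^s} - \frac{1}{(n+1)^s}\Bigr) + \frac{S(N)}{(N+1)^s}.
\end{align*}
By Lemma 5.6, $|S(n)| < m$; in particular, this implies that
$S(N)/(N + 1)^s \to 0$ as $N \to \infty$. For the terms of the sum we have
\begin{align*}
\Bigl|S(n)\Bigl(\frac{1}{n^s} - \frac{1}{(n+1)^s}\Bigr)\Bigr| &\le m\,\Bigl|s\int_n^{n+1} t^{-s-1}\,dt\Bigr| \le m|s|\int_n^{n+1} t^{-\sigma-1}\,dt \\
&\le \frac{m|s|}{n^{1+\sigma}} \le \frac{m|s|}{n^{1+\delta}}
\end{align*}
in the domain $\operatorname{Re} s \ge \delta$. Since the dominant series
\[ m|s| \sum_{n=1}^{\infty} \frac{1}{n^{1+\delta}} \]
converges, we conclude, again appealing to the Weierstrass theorem, that
the sequence of partial sums
\[ \sum_{n=1}^{N} \frac{\chi(n)}{n^s} \]
uniformly converges to an analytic function in the domain $\operatorname{Re} s \ge \delta$,
$|s| \le M$; thus, $L(s, \chi)$ is analytic in the half-plane $\operatorname{Re} s > 0$ and we can
differentiate its series representation there term-wise.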
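
The bounded partial sums $S(n)$ are what makes the series converge to the left of $\operatorname{Re} s = 1$. A small numerical sketch for the non-principal character modulo 4 (the choice of character and the function names are ours): at $s = 1$ the series is the Leibniz series for $\pi/4$, and at $s = 1/2$ the partial sums still settle down, although slowly.

```python
import math

def chi4(n):
    """The non-principal Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]

def partial_sum(s, N):
    """Partial sum of the Dirichlet series L(s, chi4) up to n = N."""
    return sum(chi4(n) / n ** s for n in range(1, N + 1))

print(partial_sum(1, 100000), math.pi / 4)
print(partial_sum(0.5, 10000), partial_sum(0.5, 40000))
```

No such stabilisation occurs for the principal character at these values of $s$, in agreement with Lemma 5.8.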


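Lemma 5.9. In the half-plane $\operatorname{Re} s > 1$, the representation
\[ -\frac{L'(s, \chi)}{L(s, \chi)} = \sum_{n=1}^{\infty} \frac{\chi(n)\Lambda(n)}{n^s} \]
is valid, where $\Lambda(n)$ is the von Mangoldt function (see Section 2.2). In
particular, $L(s, \chi)$ does not vanish in the half-plane.


Proof. The proof of this statement is exactly the same as of Lemma 2.4
(and Theorem 2.2) earlier. From Lemma 5.8 we have
\[ L'(s, \chi) = -\sum_{n=1}^{\infty} \frac{\chi(n)\ln n}{n^s} \]
in the half-plane $\operatorname{Re} s > 1$, while
\[ \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \cdot \sum_{n=1}^{\infty} \frac{\chi(n)\Lambda(n)}{n^s} = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \sum_{d \mid n} \Lambda(d) = \sum_{n=1}^{\infty} \frac{\chi(n)\ln n}{n^s}, \]
from which the required identity follows. The vanishing of $L(s, \chi)$ for
some $s$ with $\operatorname{Re} s > 1$ would produce a pole of the logarithmic derivative
$L'(s, \chi)/L(s, \chi)$, while the series representation guarantees none.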


5.6 Euler's product for Dirichlet L-functions.
Analytic continuation to the domain $\operatorname{Re} s > 0$


Applying Lemma 2.6 with the choice $f(n) = \chi(n)/n^s$, where $\chi$ is a
character modulo $m$, we arrive at the following Euler-type product identity for
$L(s, \chi)$.

Theorem 5.7. In the half-plane $\operatorname{Re} s > 1$, we have
\[ L(s, \chi) = \prod_{p} \Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1}, \]
where the product is over all primes.
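
As a numerical illustration, here is a sketch comparing a truncated series with a truncated Euler product for the non-principal character modulo 4 at $s = 2$. The character, the truncation points and the function names are our own choices.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 1)."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def chi4(n):
    """The non-principal Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]

def series(s, N):
    return sum(chi4(n) / n ** s for n in range(1, N + 1))

def euler_product(s, P):
    result = 1.0
    for p in primes_up_to(P):
        result /= 1 - chi4(p) / p ** s
    return result

print(series(2, 100000), euler_product(2, 10000))
```

Both numbers approximate $L(2, \chi)$, which for this character is Catalan's constant; the product converges noticeably more slowly than the alternating series.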


One important corollary of this identity is a simple recipe to continue
$L(s, \chi_0)$ analytically to the half-plane $\operatorname{Re} s > 0$.

Lemma 5.10. The formula
\[ L(s, \chi_0) = \zeta(s) \prod_{p \mid m} \Bigl(1 - \frac{1}{p^s}\Bigr) \]
defines the analytic continuation of $L(s, \chi_0)$ to the domain $\operatorname{Re} s > 0$. It has
a single singular point there: the pole of order 1 at $s = 1$, with residue
\[ \prod_{p \mid m} \Bigl(1 - \frac{1}{p}\Bigr) = \frac{\varphi(m)}{m}. \]


Proof. It follows from Theorems 5.7 and 2.4 that
\[ L(s, \chi_0) = \prod_{p} \Bigl(1 - \frac{\chi_0(p)}{p^s}\Bigr)^{-1} = \prod_{p \nmid m} \Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} = \zeta(s) \prod_{p \mid m} \Bigl(1 - \frac{1}{p^s}\Bigr) \]
in the half-plane $\operatorname{Re} s > 1$. As $\zeta(s)$ is an analytic function in the larger
domain $\operatorname{Re} s > 0$, and its only singular point there is the single simple pole
with residue 1 at $s = 1$, we deduce the related implications for the function
$L(s, \chi_0)$ as well.


5.7 The nonvanishing of L(1, x) for non-principal 
characters x 


Given $m \ge 2$, we now turn to studying the properties of the product
\[ F(s) = \prod_{\chi} L(s, \chi), \]
where the product is over all characters modulo $m$. As each of the factors
in the product is a series of the form
\[ \sum_{n=1}^{\infty} \frac{b_n}{n^s} \]
for some $b_n \in \mathbb{C}$, the product also has this form.


Lemma 5.11. The expansion
\[ F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s} \]
represents an analytic function in the domain $\operatorname{Re} s > 1$, for which the
derivatives can be computed by term-wise differentiation:
\[ F^{(k)}(s) = (-1)^k \sum_{n=1}^{\infty} \frac{a_n(\ln n)^k}{n^s}, \quad \text{where } k = 1, 2, \dots. \]
Furthermore, the coefficients $a_n$ are non-negative integers, and if $n = q^{\varphi(m)}$
for some $q$ coprime with $m$, then $a_n$ are positive integers.
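Proof. Fix a prime $p$ such that $p \nmid m$ and denote by $f$ the least positive
exponent for which $p^f \equiv 1 \pmod m$. By Lemma 5.7 the set $\{\chi(p) : \chi\}$
consists of all roots of unity of degree $f$, each occurring exactly $g = \varphi(m)/f$
times. Denote by $\xi_1, \dots, \xi_f$ all such roots of unity, so that
$(1 - \xi_1t) \cdots (1 - \xi_ft) = 1 - t^f$. Taking $t = 1/p^s$ and using the binomial formula
\[ (1 - x)^{-g} = \sum_{r=0}^{\infty} \frac{(-g)(-g-1) \cdots (-g-r+1)}{r!}(-x)^r = \sum_{r=0}^{\infty} \binom{r+g-1}{g-1} x^r \]
when $|x| < 1$, we therefore obtain
\[ \prod_{\chi} \Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1} = \Bigl(1 - \frac{1}{p^{fs}}\Bigr)^{-g} = \sum_{r=0}^{\infty} \binom{r+g-1}{g-1} p^{-rfs} = \sum_{k=0}^{\infty} u_{p,k}\,p^{-ks}, \]
where
\[ u_{p,k} = \begin{cases} 0 & \text{if } f \nmid k, \\ \binom{k/f+g-1}{g-1} & \text{if } f \mid k. \end{cases} \]
It follows from the latter expansion that
\begin{align*}
\prod_{\chi} \prod_{\substack{p \le N \\ p \nmid m}} \Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1}
&= \prod_{\substack{p \le N \\ p \nmid m}} \sum_{k=0}^{\infty} u_{p,k}\,p^{-ks} \\
&= \sum_{k_1, \dots, k_r \ge 0} u_{p_1,k_1} \cdots u_{p_r,k_r}\,p_1^{-k_1s} \cdots p_r^{-k_rs} = {\sum_n}' \frac{a_n}{n^s},
\end{align*}
where $p_1, \dots, p_r$ are the primes up to $N$ not dividing $m$, and the integers $n$
involved in the latter sum $\sum'$ all have their prime
divisors at most $N$ (and relatively prime with $m$); here
\[ a_n = \begin{cases} u_{p_1,k_1} \cdots u_{p_r,k_r} & \text{if } (n, m) = 1 \text{ and } n = p_1^{k_1} \cdots p_r^{k_r}, \\ 0 & \text{if } (n, m) \ne 1. \end{cases} \]
This expression means that $a_n$ are always non-negative integers; if
$n = q^{\varphi(m)}$ then all exponents in the prime factorisation of $n$ are divisible by
$\varphi(m)$, so that all $u_{p_j,k_j}$ are positive integers.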
Take now an arbitrary $\sigma > 1$. Then


and 


nee) 


x psn 


so that all the partial sums of the series 


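of non-negative terms are bounded above by the constant $|F(\sigma)|$
independent of $N$. We wish to demonstrate the uniform convergence of
\[ \sum_{n=1}^{N} \frac{a_n}{n^s} \]
to $F(s)$ in the domain $\operatorname{Re} s \ge \sigma$. We have
\begin{align*}
\Bigl|F(s) - \sum_{n=1}^{N} \frac{a_n}{n^s}\Bigr|
&\le \Bigl|F(s) - \prod_{\chi} \prod_{p \le N} \Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1}\Bigr|
 + \Bigl|\prod_{\chi} \prod_{p \le N} \Bigl(1 - \frac{\chi(p)}{p^s}\Bigr)^{-1} - \sum_{n=1}^{N} \frac{a_n}{n^s}\Bigr| \\
&\le \sum_{n \ge N+1} \frac{a_n}{|n^s|} + \sum_{n \ge N+1} \frac{a_n}{|n^s|} \le 2\sum_{n \ge N+1} \frac{a_n}{n^{\sigma}} \to 0
\end{align*}
as $N \to \infty$, with all the estimates uniform in the domain. Then the
Weierstrass theorem guarantees the uniform convergence and term-wise
differentiation for $\operatorname{Re} s \ge \sigma$, hence for $\operatorname{Re} s > 1$, since $\sigma > 1$ is chosen arbitrarily.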
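
For a concrete modulus the coefficients $a_n$ can be computed in two ways. The sketch below takes $m = 4$ (our choice), where $F(s)$ is the product of the two L-functions modulo 4, and compares the Dirichlet convolution of the two characters with the product formula through $u_{p,k}$ from the proof; all function names are hypothetical.

```python
from math import comb, gcd

M, PHI = 4, 2                      # modulus and phi(4); there are two characters modulo 4

def chi0(n):
    return 1 if n % 2 else 0       # principal character modulo 4

def chi4(n):
    return (0, 1, 0, -1)[n % 4]    # non-principal character modulo 4

def a_by_convolution(n):
    """Coefficient of n^{-s} in L(s, chi0) * L(s, chi4)."""
    return sum(chi0(d) * chi4(n // d) for d in range(1, n + 1) if n % d == 0)

def u(p, k):
    """u_{p,k} for a prime p not dividing M, as in the proof of Lemma 5.11."""
    f = 1
    while pow(p, f, M) != 1:       # f = order of p modulo M
        f += 1
    g = PHI // f
    return comb(k // f + g - 1, g - 1) if k % f == 0 else 0

def a_by_formula(n):
    if gcd(n, M) != 1:
        return 0
    result, p = 1, 3
    while n > 1:                   # strip odd prime factors in increasing order
        k = 0
        while n % p == 0:
            n //= p
            k += 1
        if k:
            result *= u(p, k)
        p += 2
    return result

print([a_by_convolution(n) for n in range(1, 26)])
```

The coefficients are non-negative, and those at odd squares (here $\varphi(4) = 2$) are positive, as the lemma predicts.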


Lemma 5.12. If $F(s)$ is analytic in the domain $\operatorname{Re} s > 0$, then the series
\[ \sum_{n=1}^{\infty} \frac{a_n}{n^s} \]
constructed in Lemma 5.11 converges to $F(s)$ in the domain.
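Proof. Assuming that $F(s)$ is analytic in the half-plane $\operatorname{Re} s > 0$, expand
the function into its Taylor series at $s = 2 + it_0$:
\[ F(s) = \sum_{k=0}^{\infty} \frac{F^{(k)}(2 + it_0)}{k!}(s - 2 - it_0)^k. \]
The series converges in the open disc $|s - (2 + it_0)| < 2$ of radius 2, as it
entirely lies in the domain of analyticity of $F(s)$. Then for $0 < \sigma < 2$, we
have
\begin{align*}
F(\sigma + it_0) &= \sum_{k=0}^{\infty} \frac{F^{(k)}(2 + it_0)}{k!}(\sigma - 2)^k = \sum_{k=0}^{\infty} \frac{(\sigma - 2)^k}{k!}(-1)^k \sum_{n=1}^{\infty} \frac{a_n(\ln n)^k}{n^{2+it_0}} \\
&= \sum_{k=0}^{\infty} \sum_{n=1}^{\infty} \frac{a_n}{k!}\,\frac{(2 - \sigma)^k(\ln n)^k}{n^{2+it_0}} = \sum_{n=1}^{\infty} \sum_{k=0}^{\infty} \frac{a_n}{k!}\,\frac{(2 - \sigma)^k(\ln n)^k}{n^{2+it_0}} \\
&= \sum_{n=1}^{\infty} \frac{a_n}{n^{2+it_0}} \sum_{k=0}^{\infty} \frac{(2 - \sigma)^k(\ln n)^k}{k!} = \sum_{n=1}^{\infty} \frac{a_n}{n^{2+it_0}}\,e^{(2-\sigma)\ln n} \\
&= \sum_{n=1}^{\infty} \frac{a_n}{n^{2+it_0}}\,n^{2-\sigma} = \sum_{n=1}^{\infty} \frac{a_n}{n^{\sigma+it_0}},
\end{align*}
which establishes the validity of the series expansion for $F(s)$ for any $s$
with $\operatorname{Re} s > 0$, as the choice of $t_0$ is arbitrary. It remains to show that the
interchange of summation over $n$ and over $k$ is legitimate. Indeed, sums of
the terms
\[ \frac{a_n}{k!}\,\frac{(2 - \sigma)^k(\ln n)^k}{n^{2+it_0}} \]
converge uniformly in $\operatorname{Re} s \ge \delta$ for any choice of $\delta > 0$, in either $n$ or $k$.
This is true because all $a_n$ are non-negative real numbers, so that
\[ \Bigl|\sum_{n=1}^{\infty} \frac{a_n}{k!}\,\frac{(2 - \sigma)^k(\ln n)^k}{n^{2+it_0}}\Bigr| \le \frac{(2 - \sigma)^k}{k!} \sum_{n=1}^{\infty} \frac{a_n(\ln n)^k}{n^2} = \frac{(2 - \sigma)^k}{k!}\,|F^{(k)}(2)| \]
and
\[ \Bigl|\sum_{k=0}^{\infty} \frac{a_n}{k!}\,\frac{(2 - \sigma)^k(\ln n)^k}{n^{2+it_0}}\Bigr| \le \frac{a_n}{n^2} \sum_{k=0}^{\infty} \frac{(2 - \sigma)^k(\ln n)^k}{k!} = \frac{a_n}{n^2}\,e^{(2-\sigma)\ln n} \]
for $\operatorname{Re} s \ge \delta$.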


Lemma 5.13. $L(1, \chi) \ne 0$ for any non-principal character $\chi$ modulo $m$.

Proof. Assume on the contrary that this is false and $L(1, \chi) = 0$ for at least one
non-principal character $\chi$ modulo $m$. Then $F(s)$ is analytic in the half-plane $\operatorname{Re} s > 0$,
since the only potential pole of $F(s)$ contributed by $L(s, \chi_0)$ at $s = 1$
(according to Lemma 5.10) is cancelled by the zero of $L(s, \chi)$ at $s = 1$. By
Lemma 5.12, the expansion
\[ F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s} \]
is valid for $F(s)$ in the domain $\operatorname{Re} s > 0$, so that the series on the right-hand
side converges for any such $s$. On the other hand, $a_n \ge 0$ and $a_n \ge 1$ for
any $n$ of the form $n = r^{\varphi(m)}$, where $(r, m) = 1$. By choosing $s_0 = 1/\varphi(m)$
we obtain a presumably convergent series,
\[ F\Bigl(\frac{1}{\varphi(m)}\Bigr) = \sum_{n=1}^{\infty} \frac{a_n}{n^{1/\varphi(m)}} \ge \sum_{\substack{k=1 \\ n = (mk+1)^{\varphi(m)}}}^{\infty} \frac{1}{n^{1/\varphi(m)}} = \sum_{k=1}^{\infty} \frac{1}{mk + 1}, \]
which is a contradiction, because the series on the right diverges. Thus, $F(s)$
cannot be analytic in the domain $\operatorname{Re} s > 0$ meaning that none of $L(s, \chi)$ vanishes at $s = 1$.


Exercise 5.11. For a character $\chi$ modulo $m$, consider the generating
function
\[ Q(x, \chi) = \sum_{n=0}^{\infty} \chi(n)x^n. \]

(a) Show that the series for $Q(x, \chi)$ converges in the disc $|x| < 1$ and that 1 is
its radius of convergence.

(b) Prove that $Q(x, \chi)$ is a rational function of $x$.

(c) For $\chi$ a non-principal character, show that
\[ L(1, \chi) = \int_0^1 Q(x, \chi)\,\frac{dx}{x}. \]

(d) For non-principal characters $\chi$ modulo 3 and modulo 4, compute the
quantities $L(1, \chi)$ and check they are nonzero.


5.8 Proof of Dirichlet’s theorem on primes in arithmetic 
progressions 


Theorem 5.8 (Dirichlet's theorem). Let $m \ge 2$ and $l$ be integers,
$(l, m) = 1$. Then the arithmetic progression $p \equiv l \pmod m$ contains
infinitely many primes.
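Proof. For a fixed character $\chi$ modulo $m$, consider the logarithmic
derivative of $L(s, \chi)$ in the half-plane $\sigma = \operatorname{Re} s > 1$, where the convergence of the
series is absolute:
\begin{align*}
-\frac{L'(s, \chi)}{L(s, \chi)} &= \sum_{n=1}^{\infty} \frac{\chi(n)\Lambda(n)}{n^s} = \sum_{p} \sum_{k=1}^{\infty} \frac{\chi(p^k)\ln p}{p^{ks}} = \sum_{p} \ln p \sum_{k=1}^{\infty} \Bigl(\frac{\chi(p)}{p^s}\Bigr)^k \\
&= \sum_{p} \ln p\,\frac{\chi(p)/p^s}{1 - \chi(p)/p^s} = \sum_{p} \frac{\chi(p)\ln p}{p^s - \chi(p)} \\
&= \sum_{p} \frac{\chi(p)\ln p}{p^s} + \sum_{p} \frac{\chi(p)^2\ln p}{p^s(p^s - \chi(p))}.
\end{align*}
In the second sum we have $|p^s - \chi(p)| \ge |p^s| - 1 = p^{\sigma} - 1 > \frac12 p^{\sigma}$, because
$p^{\sigma} > 2$ for $\sigma > 1$; therefore,
\[ \Bigl|\sum_{p} \frac{\chi(p)^2\ln p}{p^s(p^s - \chi(p))}\Bigr| \le \sum_{p} \frac{\ln p}{p^{\sigma}|p^s - \chi(p)|} \le \sum_{p} \frac{2\ln p}{p^{2\sigma}} \le \sum_{p} \frac{2\ln p}{p^2} \le \sum_{n=2}^{\infty} \frac{2\ln n}{n^2} \]
is the uniform bound for the sum in the domain $\operatorname{Re} s > 1$. We conclude
that
\[ -\frac{L'(s, \chi)}{L(s, \chi)} = \sum_{p} \frac{\chi(p)\ln p}{p^s} + O(1) \quad \text{as } s \to 1^+ \]
along the real axis.
Define an integer $v$ such that $vl \equiv 1 \pmod m$; it represents a particular
residue class in $(\mathbb{Z}/m\mathbb{Z})^*$. It follows from Theorem 5.6 that
\[ \sum_{p} \frac{\ln p}{p^s} \sum_{\chi} \chi(pv) = \sum_{p\,:\,pv \equiv 1 \,(\mathrm{mod}\ m)} \frac{\ln p}{p^s}\,\varphi(m) = \varphi(m) \sum_{p\,:\,p \equiv l \,(\mathrm{mod}\ m)} \frac{\ln p}{p^s}, \]
while from the asymptotics above we obtain
\begin{align*}
\sum_{p} \frac{\ln p}{p^s} \sum_{\chi} \chi(pv) &= \sum_{\chi} \chi(v) \sum_{p} \frac{\chi(p)\ln p}{p^s} = -\sum_{\chi} \chi(v)\,\frac{L'(s, \chi)}{L(s, \chi)} + O(1) \\
&= -\chi_0(v)\,\frac{L'(s, \chi_0)}{L(s, \chi_0)} - \sum_{\chi \ne \chi_0} \chi(v)\,\frac{L'(s, \chi)}{L(s, \chi)} + O(1) \\
&= -\frac{L'(s, \chi_0)}{L(s, \chi_0)} + O(1)
\end{align*}
as $s \to 1^+$, because from Lemma 5.13 the functions $L'(s, \chi)/L(s, \chi)$ are
analytic at $s = 1$ for all non-principal characters $\chi$. We know from
Lemma 5.10 that the function $L(s, \chi_0)$ has a simple pole at $s = 1$; hence
$-L'(s, \chi_0)/L(s, \chi_0)$ has a pole of order 1 there with residue 1. Thus,
\[ \varphi(m) \sum_{p\,:\,p \equiv l \,(\mathrm{mod}\ m)} \frac{\ln p}{p^s} = \frac{1}{s - 1} + O(1) \]
as $s \to 1^+$ meaning that the series
\[ \sum_{p\,:\,p \equiv l \,(\mathrm{mod}\ m)} \frac{\ln p}{p} \]
diverges; in particular, it involves infinitely many terms.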
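
A count of primes by residue class gives a feeling for the statement. The sketch below is only an experiment for a bounded range, not evidence of infinitude; the modulus 10, the bound and the function names are our own choices.

```python
from math import gcd

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 1)."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def count_by_class(m, x):
    """Number of primes p <= x in each residue class l modulo m with (l, m) = 1."""
    counts = {l: 0 for l in range(m) if gcd(l, m) == 1}
    for p in primes_up_to(x):
        if p % m in counts:
            counts[p % m] += 1
    return counts

print(count_by_class(10, 10000))
```

The classes coprime to the modulus receive roughly equal shares of the primes, which is the quantitative form of the theorem mentioned in the chapter notes.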


Chapter notes 


The first proof of Wilson’s theorem (1770), which is stated in Exercise 5.3, 
was given by Lagrange in 1771. 

The proof of Dirichlet's theorem given by Dirichlet (1837) introduces
the notions of Dirichlet characters and Dirichlet L-functions; these inspired
several branches of mathematics to appear, with interconnections between
those unified by Langlands' programme. Particular instances of the latter
are exemplified in [75] through the quadratic Legendre(–Jacobi–Kronecker)
symbols and the reciprocity law for them, and this study traces back to
Euler.

Though we do not pursue quantitative aspects of Dirichlet's theorem on
primes in arithmetic progressions, one can naturally extend the argument
of Chapter 2 to show that
\[ \sum_{\substack{p \le x \\ p \equiv l \,(\mathrm{mod}\ m)}} 1 \sim \frac{1}{\varphi(m)}\,\frac{x}{\ln x} \quad \text{as } x \to \infty, \]
when $(l, m) = 1$. The question about representability of primes as values
of a single-variable polynomial of degree at least 2 (for example, of $u^2 + 1$) is
wide open. Proven results for two-variable polynomials (then of degree at
least 3, to distinguish from 'easier' situations like in Theorem 4.1), namely that there
are infinitely many prime numbers of the form $u^2 + v^4$ by J. Friedlander
and H. Iwaniec [32] and of the form $u^3 + 2v^3$ by R. Heath-Brown [49], are
already at the boundary of what is possible for the current methods.
Chapter 6


Algebraic and transcendental
numbers.
The transcendence of e and π


6.1 Algebraic numbers: basic properties 
and algebraic closedness 


A complex number $\alpha$ is said to be algebraic if there is a polynomial $f(x) \not\equiv 0$
with rational coefficients such that $f(\alpha) = 0$. Rational numbers are basic
examples of algebraic numbers, when such polynomials can be taken linear.


Lemma 6.1. Let $\alpha$ be algebraic and $f(x) \in \mathbb{Q}[x]$ a non-trivial polynomial
of least possible degree such that $f(\alpha) = 0$. Then $f(x)$ is irreducible over
$\mathbb{Q}$. Furthermore, if $g(x) \in \mathbb{Q}[x]$ and $g(\alpha) = 0$ then $f(x)$ divides $g(x)$.


Proof. Assume that $f(x)$ is reducible over $\mathbb{Q}$, that is, $f(x) = u(x)v(x)$
for some polynomials $u(x)$ and $v(x)$ with rational coefficients, of smaller
degrees. Then $0 = f(\alpha) = u(\alpha)v(\alpha)$ implying $u(\alpha) = 0$ or $v(\alpha) = 0$, thus
contradicting the minimality of degree hypothesis on $f$.

For the second part, divide $g(x)$ by $f(x)$ in $\mathbb{Q}[x]$ with remainder: $g(x) =
q(x)f(x) + r(x)$, where $r(x) \equiv 0$ or $\deg r(x) < \deg f(x)$. Then
\[ 0 = g(\alpha) = q(\alpha)f(\alpha) + r(\alpha) = r(\alpha), \]
hence the minimality of degree hypothesis on $f$ implies $r(x) \equiv 0$.


An irreducible polynomial from Lemma 6.1 is called a minimal polynomial
of $\alpha$. It follows from the lemma that all such minimal polynomials of a
given algebraic number $\alpha$ are rationally proportional to each other.
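
The division with remainder used in the proof of Lemma 6.1 can be tried out on a small case. The following is a minimal sketch, not part of the original text: the helper name `poly_divmod` and the coefficient-list convention (constant term first) are choices made for this illustration only. It divides two polynomials that vanish or do not vanish at $\sqrt{2}$ by $x^2 - 2$.

```python
from fractions import Fraction

def poly_divmod(g, f):
    """Divide g by f in Q[x]; coefficients are listed from the constant term up.

    Returns (q, r) with g = q*f + r, where r == [] (the zero polynomial)
    or deg r < deg f. The leading coefficient f[-1] must be nonzero.
    """
    g = [Fraction(c) for c in g]
    f = [Fraction(c) for c in f]
    q = [Fraction(0)] * max(len(g) - len(f) + 1, 0)
    r = g[:]
    while len(r) >= len(f):
        shift = len(r) - len(f)
        coeff = r[-1] / f[-1]
        q[shift] = coeff
        for i, c in enumerate(f):
            r[shift + i] -= coeff * c
        # the leading coefficient is now zero; strip it and any further zeros
        while r and r[-1] == 0:
            r.pop()
    return q, r

# f(x) = x^2 - 2 is a minimal polynomial of sqrt(2)
f = [-2, 0, 1]
# g(x) = x^4 - 4 vanishes at sqrt(2): the remainder is zero, so f divides g
q1, r1 = poly_divmod([-4, 0, 0, 0, 1], f)
# h(x) = x^3 - 2 does not vanish at sqrt(2): the remainder 2x - 2 is nonzero
q2, r2 = poly_divmod([-2, 0, 0, 1], f)
```

Exact rational arithmetic (`Fraction`) is used so that the test "remainder is zero" is not affected by rounding.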


Exercise 6.1. Show that for each positive integer $n$, the polynomial $x^n - 2$
is irreducible.


Exercise 6.2 (Eisenstein’s criterion). Assume that for a polynomial

\[
f(x) = x^n + a_{n-1}x^{n-1} + \dots + a_1 x + a_0 \in \mathbb{Z}[x]
\]

there is a prime $p$ dividing all the coefficients $a_{n-1}, \dots, a_1, a_0$ but $p^2 \nmid a_0$.
Show that $f(x)$ is irreducible.


Exercise 6.3. Show that the cyclotomic polynomial $\Phi_p(x) = x^{p-1} + x^{p-2} +
\dots + x + 1$, where $p$ is a prime, is irreducible.


Hint. Use Eisenstein’s criterion. 
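
The hypotheses of Eisenstein’s criterion are easy to test mechanically. The sketch below is an illustration only: the function names are hypothetical, and the shift $x \mapsto x + 1$ is one standard way of bringing $\Phi_p$ into a form to which the criterion applies (an assumption about what the hint has in mind). It uses the expansion $\Phi_p(x+1) = ((x+1)^p - 1)/x = \sum_{k=1}^{p} \binom{p}{k} x^{k-1}$.

```python
from math import comb

def eisenstein_applies(coeffs, p):
    """coeffs = [a_0, a_1, ..., a_{n-1}, 1] for a monic integer polynomial.

    Returns True if p divides a_0, ..., a_{n-1} and p^2 does not divide a_0,
    that is, if the hypotheses of Eisenstein's criterion hold for the prime p.
    """
    *lower, lead = coeffs
    return (lead == 1
            and all(a % p == 0 for a in lower)
            and lower[0] % (p * p) != 0)

def shifted_cyclotomic(p):
    """Coefficients of Phi_p(x + 1), constant term first."""
    return [comb(p, k + 1) for k in range(p)]

checks = {p: eisenstein_applies(shifted_cyclotomic(p), p) for p in (2, 3, 5, 7, 11)}
```

Irreducibility of $\Phi_p(x+1)$ is equivalent to irreducibility of $\Phi_p(x)$, since a factorisation of one gives a factorisation of the other after the substitution.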


Lemma 6.2. Suppose that a polynomial $f(x) \in \mathbb{Q}[x]$ is irreducible and has
a common zero with a polynomial $g(x) \in \mathbb{Q}[x]$. Then $f(x)$ divides $g(x)$,
and all zeros of $f(x)$ are zeros of $g(x)$ as well.


Proof. Let $\alpha$ be the common zero of $f(x)$ and $g(x)$; it is algebraic. From
the irreducibility of $f(x)$ and Lemma 6.1, the polynomial is a minimal
polynomial for $\alpha$ and it divides $g(x)$. This immediately implies the claim.


The degree of an algebraic number $\alpha$ is the degree of its minimal poly-
nomial. In particular, rational numbers have degree 1, while quadratic
irrationalities, like $i = \sqrt{-1}$, have degree 2.


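Lemma 6.3. Suppose that $\gamma_1, \dots, \gamma_m$ are complex numbers, not all zero;
define their span

\[
L = \{ r_1\gamma_1 + \dots + r_m\gamma_m : r_j \in \mathbb{Q} \text{ for } j = 1, \dots, m \}
\]

over $\mathbb{Q}$ (also known as a $\mathbb{Q}$-lattice). If $\lambda L \subset L$ then $\lambda$ is algebraic of degree
at most $m$.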


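Proof. Indeed, the inclusion is equivalent to the system of linear equations

\[
\begin{aligned}
\lambda\gamma_1 &= r_{11}\gamma_1 + r_{12}\gamma_2 + \dots + r_{1m}\gamma_m, \\
\lambda\gamma_2 &= r_{21}\gamma_1 + r_{22}\gamma_2 + \dots + r_{2m}\gamma_m, \\
&\;\;\vdots \\
\lambda\gamma_m &= r_{m1}\gamma_1 + r_{m2}\gamma_2 + \dots + r_{mm}\gamma_m,
\end{aligned}
\]

with certain rational coefficients $r_{jk}$, or

\[
\begin{aligned}
0 &= (r_{11} - \lambda)\gamma_1 + r_{12}\gamma_2 + \dots + r_{1m}\gamma_m, \\
0 &= r_{21}\gamma_1 + (r_{22} - \lambda)\gamma_2 + \dots + r_{2m}\gamma_m, \\
&\;\;\vdots \\
0 &= r_{m1}\gamma_1 + r_{m2}\gamma_2 + \dots + (r_{mm} - \lambda)\gamma_m.
\end{aligned}
\]

The determinant of the associated matrix, $f(x) = \det_{1 \le j,k \le m}(r_{jk} - x\delta_{jk})$,
vanishes at $x = \lambda$, as the system has a nonzero solution $(\gamma_1, \dots, \gamma_m)$. Since
$f(x)$ is a polynomial with rational coefficients of degree at most $m$, the
desired claim follows.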
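
The determinant argument of Lemma 6.3 can be carried out explicitly in a small case. The sketch below is an illustration only: it takes $\lambda = \sqrt{2} + \sqrt{3}$ and the span of $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$, writes down the matrix $(r_{jk})$ by hand, and computes the monic characteristic polynomial $\det(xI - A)$ with the Faddeev-LeVerrier recursion (a technique chosen here for convenience, not one used in the text). The function name `charpoly` is hypothetical.

```python
from fractions import Fraction

def charpoly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(xI - A), leading term first.

    Faddeev-LeVerrier recursion with exact rational arithmetic.
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][l] * M[l][j] for l in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs

# Rows express lambda * gamma_j in the basis (1, sqrt2, sqrt3, sqrt6):
#   lambda * 1     = sqrt2 + sqrt3
#   lambda * sqrt2 = 2 + sqrt6
#   lambda * sqrt3 = 3 + sqrt6
#   lambda * sqrt6 = 3*sqrt2 + 2*sqrt3
A = [
    [0, 1, 1, 0],
    [2, 0, 0, 1],
    [3, 0, 0, 1],
    [0, 3, 2, 0],
]
coeffs = charpoly(A)          # x^4 - 10 x^2 + 1

lam = 2 ** 0.5 + 3 ** 0.5
value_at_lambda = sum(float(c) * lam ** (4 - i) for i, c in enumerate(coeffs))
```

Since the size of the matrix is even here, $\det(xI - A)$ coincides with the determinant $\det(r_{jk} - x\delta_{jk})$ of the lemma; the result shows that $\sqrt{2} + \sqrt{3}$ is algebraic of degree at most 4.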


Theorem 6.1. Let $\alpha$ and $\beta$ be two algebraic numbers. Then $\alpha + \beta$, $\alpha - \beta$,
$\alpha\beta$ and $\alpha/\beta$ (if $\beta \ne 0$) are algebraic, each of degree not more than the
product of degrees of $\alpha$ and of $\beta$.


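Proof. Denote by $f(x) = x^n + a_{n-1}x^{n-1} + \dots + a_0$ a (monic) minimal poly-
nomial of $\alpha$, of degree $n$, and by $g(x)$ a minimal polynomial of $\beta$, of degree $k$.
Set $m = nk$ and $\{\gamma_j : j = 1, \dots, m\} = \{\alpha^r\beta^s : r = 0, 1, \dots, n-1, \; s =
0, 1, \dots, k-1\}$, and define the rational span $L$ of the latter set as in
Lemma 6.3. Then $\alpha L \subset L$. Indeed, $\alpha \cdot \alpha^r\beta^s = \alpha^{r+1}\beta^s$ is one of the
elements $\gamma_j$ if $r < n - 1$, while

\[
\alpha^{r+1}\beta^s = \alpha^n\beta^s = -a_0\beta^s - a_1\alpha\beta^s - \dots - a_{n-1}\alpha^{n-1}\beta^s \in L
\]

if $r = n - 1$. For the same reason, $\beta L \subset L$. The inclusions imply

\[
(\alpha \pm \beta)L \subset \alpha L + \beta L \subset L \quad \text{and} \quad (\alpha\beta)L = \alpha(\beta L) \subset \alpha L \subset L,
\]

so that the numbers $\alpha \pm \beta$ and $\alpha\beta$ are algebraic of degree at most $m$ by
Lemma 6.3. For the last part, observe that for $\beta \ne 0$ its reciprocal $\beta^{-1}$ is a
zero of the polynomial $x^k g(1/x)$ of degree $k$, hence it is algebraic of degree
at most $k$. From the above, we conclude that $\alpha/\beta = \alpha \cdot \beta^{-1}$ is algebraic of
degree at most $m = nk$.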


Exercise 6.4. Show that if $\beta \ne 0$ is an algebraic number of degree $k$ then
its reciprocal $1/\beta$ is also algebraic of the same degree.


Theorem 6.2 (algebraic closedness of algebraic numbers). Let $f(x) =
x^n + a_{n-1}x^{n-1} + \dots + a_1 x + a_0$ be a polynomial with algebraic coefficients
$a_{n-1}, \dots, a_1, a_0$ and $\alpha \in \mathbb{C}$ a zero of $f(x)$. Then $\alpha$ is algebraic.


Proof. For each algebraic coefficient $a_j$ of the polynomial, denote by $k_j$ its
degree. This time, the set $L$ in Lemma 6.3 is the $\mathbb{Q}$-span of the following
numerical collection:

\[
\{ \alpha^j a_0^{j_0} a_1^{j_1} \cdots a_{n-1}^{j_{n-1}} : 0 \le j < n; \; 0 \le j_0 < k_0; \; 0 \le j_1 < k_1; \; \dots; \; 0 \le j_{n-1} < k_{n-1} \}.
\]

As in the proof of Theorem 6.1 we clearly have $a_j L \subset L$ for $j = 0, 1, \dots,
n-1$. In a similar way, $\alpha L \subset L$, because

\[
\alpha \cdot \alpha^j a_0^{j_0} a_1^{j_1} \cdots a_{n-1}^{j_{n-1}} = \alpha^{j+1} a_0^{j_0} a_1^{j_1} \cdots a_{n-1}^{j_{n-1}},
\]

which is a number from the collection for $j < n - 1$ and is equal to

\[
(-a_0 - a_1\alpha - \dots - a_{n-1}\alpha^{n-1}) \, a_0^{j_0} a_1^{j_1} \cdots a_{n-1}^{j_{n-1}} \in L
\]

for $j = n - 1$. It follows from Lemma 6.3 that $\alpha$ is algebraic of degree at
most $n k_0 k_1 \cdots k_{n-1}$.


6.2 Rational approximations of real numbers. 
Non-quadraticity of e 


Theorem 6.3. Let $\alpha$ be a real number and $t$ a positive integer. Then there
is a rational number $p/q$ such that

\[
\Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{1}{qt}
\]

and $1 \le q \le t$.


Proof. The proof makes use of an elementary argument known as Dirichlet’s
box principle or as the pigeonhole principle. For the fractional parts,

\[
\{\alpha x\} \in [0, 1) = \Bigl[0, \frac{1}{t}\Bigr) \cup \Bigl[\frac{1}{t}, \frac{2}{t}\Bigr) \cup \dots \cup \Bigl[\frac{t-1}{t}, 1\Bigr),
\]

when $x$ runs over integers $0, 1, \dots, t$, at least two of the values of $\{\alpha x\}$ fall
into the same interval. This means that there are two integers $x_1 < x_2$
from the set $\{0, 1, \dots, t\}$ such that

\[
|\{\alpha x_2\} - \{\alpha x_1\}| < \frac{1}{t}.
\]

Choose $q = x_2 - x_1$ (so that $1 \le q \le t$) and $p = \lfloor\alpha x_2\rfloor - \lfloor\alpha x_1\rfloor$ to get

\[
|q\alpha - p| = |\alpha(x_2 - x_1) - (\lfloor\alpha x_2\rfloor - \lfloor\alpha x_1\rfloor)| = |\{\alpha x_2\} - \{\alpha x_1\}| < \frac{1}{t}.
\]

This concludes the construction of the fraction $p/q$ with required properties.
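
The box-principle argument is constructive and translates directly into a search. The following is a minimal sketch under stated assumptions: the function name `dirichlet_approximation` is hypothetical, and the real number is passed as an exact `Fraction` (here the exact binary value of the float `sqrt(2)`), so that fractional parts are computed without rounding.

```python
from fractions import Fraction
from math import floor, sqrt

def dirichlet_approximation(alpha, t):
    """Return (p, q) with 1 <= q <= t and |q*alpha - p| < 1/t.

    alpha is a Fraction, t a positive integer. The t + 1 fractional parts
    {alpha * x}, x = 0, ..., t, are sorted into t boxes [i/t, (i+1)/t);
    two of them must share a box.
    """
    boxes = {}
    for x in range(t + 1):
        frac = alpha * x - floor(alpha * x)   # fractional part, in [0, 1)
        box = floor(frac * t)                 # index of the box containing frac
        if box in boxes:
            x1 = boxes[box]
            q = x - x1
            p = floor(alpha * x) - floor(alpha * x1)
            return p, q
        boxes[box] = x
    raise AssertionError("unreachable: t + 1 values cannot avoid a collision in t boxes")

alpha = Fraction(sqrt(2))     # exact rational value of the float, used as a stand-in
results = {t: dirichlet_approximation(alpha, t) for t in (1, 10, 100, 1000)}
```

The function returns the first collision it meets, so the fraction found need not be the best possible one with denominator up to $t$; it only carries the guarantee of Theorem 6.3.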


As a corollary we deduce the following statement. 


Theorem 6.4 (Dirichlet’s theorem). If $\alpha$ is a real irrational number then
the inequality

\[
\Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{1}{q^2}
\]

has infinitely many solutions in rationals $p/q$.

In fact, a stronger statement follows from the theory of continued frac-
tions (see Section 4.4).


Proof. Taking $t = n$ in Theorem 6.3 we deduce that, for each $n = 1, 2, \dots$,
there is a fraction $p_n/q_n$ such that $|\alpha - p_n/q_n| < 1/(nq_n)$ and $1 \le q_n \le n$.
The two inequalities imply $|\alpha - p_n/q_n| < 1/q_n^2$ since $nq_n \ge q_n^2$. Furthermore,
$1/(nq_n) \to 0$ as $n \to \infty$, so that $|\alpha - p_n/q_n| \to 0$ as $n \to \infty$. Since the
equality $|\alpha - p_n/q_n| = 0$ is not possible for irrational $\alpha$, there are infinitely
many distinct elements in the sequence $\{p_n/q_n\}_{n=1}^{\infty}$ to accommodate the limiting
relation.


Observe that Dirichlet’s theorem fails for rational $\alpha = a/b$, since for any
fraction $p/q \ne \alpha$ we get

\[
\Bigl|\alpha - \frac{p}{q}\Bigr| = \frac{|aq - bp|}{bq} \ge \frac{1}{bq} = \frac{1/b}{q} \ge \frac{1}{q^2}
\]

whenever $q \ge b$. In fact, the estimate shows that in this case $|\alpha - p/q| \ge c/q$
for all positive integers $q$, where $c = 1/b$. Thus, we can state the following
irrationality criterion.


Lemma 6.4. If $\alpha$ is real and there are infinitely many fractions $p/q$ such
that $0 < |\alpha - p/q| = o(1/q)$ as $q \to \infty$, then $\alpha$ is irrational.


Notice that the existence of infinitely many solutions $p/q$ to the dio-
phantine inequality $0 < |\alpha - p/q| = o(1/q)$ implies a posteriori a stronger
conclusion (thanks to Dirichlet’s theorem): there are infinitely many solu-
tions $p/q$ to the inequality $0 < |\alpha - p/q| < 1/q^2$.


Exercise 6.5. Show that for all integers p and q > 2, 


The next diophantine results are for the constant

\[
e = \lim_{n \to \infty} \Bigl(1 + \frac{1}{n}\Bigr)^n = \sum_{k=0}^{\infty} \frac{1}{k!}
= 2.71828182845904523536028747135266249775724709369995\dots.
\]

It is classical that the partial sums of the latter series produce excellent
rational approximations to the number, good enough for proving its irra-
tionality (in view of Lemma 6.4, for example). We will establish a somewhat
stronger result.

Theorem 6.5. (a) The number $e^2$ is not a quadratic irrationality.
(b) The number $e$ is not a quadratic irrationality.


Proof. (a) Assume on the contrary that $e^2$ is a quadratic irrationality, so
that there is a positive integer $a$ and some integers $b, c$ such that $ae^2 +
be^{-2} + c = 0$.

As we know from Exercise 1.3, the exact power of prime 2 in $n!$ is equal
to

\[
\nu_2(n!) = \Bigl\lfloor \frac{n}{2} \Bigr\rfloor + \Bigl\lfloor \frac{n}{4} \Bigr\rfloor + \Bigl\lfloor \frac{n}{8} \Bigr\rfloor + \dots;
\]

in particular, $\nu_2(2^m!) = \nu_2((2^m + 1)!) = 2^{m-1} + \dots + 2 + 1 = 2^m - 1$. It means
that the rational number $2^n/n!$ written to the lowest terms as $2^{\alpha_n}/M_n$ has
$\alpha_n = 1$ when $n = 2^m$ and $\alpha_n = 2$ if $n = 2^m + 1$ for some $m \in \mathbb{N}$. In what
follows we choose and fix $n$ of one of these forms to be sufficiently large.
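
The two special values of the exponent can be checked numerically. The sketch below is an illustration only (the function names are hypothetical): it evaluates the exponent of 2 in $n!$ by Legendre’s formula, that is, by summing $\lfloor n/2^i \rfloor$, and then the exponent $\alpha_n = n - \nu_2(n!)$ of 2 in the reduced form of $2^n/n!$.

```python
from math import factorial

def nu2_factorial(n):
    """Exponent of 2 in n!, via Legendre's formula: sum of floor(n / 2^i)."""
    total, power = 0, 2
    while power <= n:
        total += n // power
        power *= 2
    return total

def alpha_n(n):
    """Exponent of 2 in the numerator of 2^n / n! written in lowest terms."""
    return n - nu2_factorial(n)

def nu2_direct(n):
    """Same exponent, obtained by dividing n! by 2 repeatedly (for cross-checking)."""
    value, count = factorial(n), 0
    while value % 2 == 0:
        value //= 2
        count += 1
    return count

table = [(m, alpha_n(2 ** m), alpha_n(2 ** m + 1)) for m in range(1, 12)]
```

For every $m \ge 1$ in the table the pair of exponents is $(1, 2)$, in agreement with the claim used in the proof.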

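From the Taylor series expansion with the remainder in the Lagrange
form,

\[
e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^{n-1}}{(n-1)!} + \frac{x^n}{n!} \Bigl(1 + \frac{x e^{\theta x}}{n+1}\Bigr) \quad \text{for some } 0 < \theta < 1.
\]

Denote by $\beta_+$ and $\beta_-$ the corresponding values of $e^{\theta x}/(n+1)$ for $x = 2$
and $x = -2$, respectively, so that

\[
\begin{aligned}
e^{2} &= 1 + \frac{2^{\alpha_1}}{M_1} + \frac{2^{\alpha_2}}{M_2} + \dots + \frac{2^{\alpha_{n-1}}}{M_{n-1}} + \frac{2^{\alpha_n}}{M_n}(1 + 2\beta_+), \\
e^{-2} &= 1 - \frac{2^{\alpha_1}}{M_1} + \frac{2^{\alpha_2}}{M_2} - \dots + (-1)^{n-1}\frac{2^{\alpha_{n-1}}}{M_{n-1}} + (-1)^n \frac{2^{\alpha_n}}{M_n}(1 - 2\beta_-).
\end{aligned}
\]

Notice that $0 < \beta_+ < e^2/(n+1)$ and $0 < \beta_- < 1/(n+1)$; in particular, for
all $n$ sufficiently large we get

\[
0 < a\beta_+ + |b|\beta_- < \frac{1}{8}.
\]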
Substituting the Taylor expansions into $ae^2 + be^{-2} + c = 0$ and multiplying
the result by $M_n$ (which is the ‘odd’ part of $n!$, so it is divisible by $M_k$ for
all $k \le n$) we obtain

\[
a \times 2^{\alpha_n} \times 2\beta_+ - (-1)^n b \times 2^{\alpha_n} \times 2\beta_- = d
\]

for some integer $d$. By choosing $n = 2^m$ if $b < 0$ and $n = 2^m + 1$ if $b > 0$ we
get the left-hand side equal to $2^{\alpha_n + 1}(a\beta_+ + |b|\beta_-)$, which is then a quantity
in the range between 0 and 1 from the inequality above and $2^{\alpha_n + 1} \in \{4, 8\}$.
Finally, the inequality $0 < d < 1$ is not possible for an integer $d$. The
contradiction implies that $e^2$ is not a quadratic irrationality.

(b) If $e$ were a quadratic irrationality then $ae + be^{-1} \in \mathbb{Z}$ for some $a, b \in
\mathbb{Z}$, not simultaneously zero, so that $(ae + be^{-1})^2 = a^2e^2 + b^2e^{-2} + 2ab \in \mathbb{Z}$,
meaning that $e^2$ is a quadratic irrationality. The latter is however excluded
by part (a).


6.3 Liouville’s theorem on rational approximations 
of irrational algebraic numbers 


The next result is what led Liouville (1844) to show for the first time that 
transcendental numbers exist. 


Theorem 6.6 (Liouville’s theorem). Let $\alpha$ be an algebraic number of degree
$n \ge 2$. Then there exists a constant $c = c(\alpha) > 0$ such that for any rational
fraction $p/q$ we have the inequality

\[
\Bigl|\alpha - \frac{p}{q}\Bigr| \ge \frac{c}{q^n}.
\]

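Proof. Denote by

\[
f(x) = a_n x^n + a_{n-1}x^{n-1} + \dots + a_1 x + a_0 \in \mathbb{Z}[x], \quad a_n > 0,
\]

the minimal polynomial of the algebraic number $\alpha$ with $\gcd(a_n,
a_{n-1}, \dots, a_1, a_0) = 1$. Its factorisation over $\mathbb{C}$ reads

\[
f(x) = a_n (x - \alpha) \times \prod_{j=2}^{n} (x - \alpha_j),
\]

where $\alpha_2, \alpha_3, \dots, \alpha_n$ are algebraic conjugates of $\alpha$. If $|\alpha - p/q| \ge 1$ then,
clearly, $|\alpha - p/q| \ge 1/q^n$, so that the desired inequality holds with $c = 1$. In
what follows we will therefore restrict ourselves to the case $|\alpha - p/q| < 1$.
From $|\alpha - p/q| < 1$ we deduce that $|p/q| < |\alpha| + 1$. This implies that

\[
\Bigl|f\Bigl(\frac{p}{q}\Bigr)\Bigr| = a_n \Bigl|\alpha - \frac{p}{q}\Bigr| \times \prod_{j=2}^{n} \Bigl|\alpha_j - \frac{p}{q}\Bigr|
\le a_n \Bigl|\alpha - \frac{p}{q}\Bigr| \times \prod_{j=2}^{n} (|\alpha_j| + |\alpha| + 1)
\le a_n \Bigl|\alpha - \frac{p}{q}\Bigr| \times (2\,\overline{|\alpha|} + 1)^{n-1},
\]

where $\overline{|\alpha|} = \max\{|\alpha|, |\alpha_2|, \dots, |\alpha_n|\}$ is the so-called house of algebraic num-
ber $\alpha$. On the other hand, $f(p/q) \ne 0$ since $f(x)$ is an irreducible polyno-
mial of degree $n \ge 2$, hence

\[
\Bigl|f\Bigl(\frac{p}{q}\Bigr)\Bigr| = \frac{|a_n p^n + a_{n-1}p^{n-1}q + \dots + a_1 p q^{n-1} + a_0 q^n|}{q^n} \ge \frac{1}{q^n}
\]

because all the numbers in the numerator are integral. Combining the two
inequalities we obtain

\[
\frac{1}{q^n} \le a_n \Bigl|\alpha - \frac{p}{q}\Bigr| \times (2\,\overline{|\alpha|} + 1)^{n-1},
\]

implying that

\[
\Bigl|\alpha - \frac{p}{q}\Bigr| \ge \frac{c}{q^n}
\]

with the choice

\[
c(\alpha) = \frac{1}{a_n (2\,\overline{|\alpha|} + 1)^{n-1}}.
\]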


Liouville’s theorem implies that the diophantine inequality

\[
0 < \Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{c(\alpha)}{q^n}
\]

does not have solutions in integers $p$ and $q > 0$.
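
A quick numerical sanity check of the constant from the proof is possible for $\alpha = \sqrt{2}$. This is a sketch under stated assumptions: for the minimal polynomial $x^2 - 2$ we have $a_n = 1$, $n = 2$ and house $\sqrt{2}$, so $c(\sqrt{2}) = 1/(2\sqrt{2} + 1)$; the search range and the function names are arbitrary choices, and floating-point arithmetic is adequate only because the denominators are small.

```python
from math import sqrt

SQRT2 = sqrt(2.0)

def liouville_constant_sqrt2():
    """c(alpha) = 1 / (a_n * (2*house + 1)^(n-1)) for alpha = sqrt(2)."""
    return 1.0 / (2.0 * SQRT2 + 1.0)

def min_scaled_distance(max_q):
    """Minimum of q^2 * |sqrt(2) - p/q| over 1 <= q <= max_q and integers p.

    For each q only the nearest integer p to q*sqrt(2) needs to be checked.
    """
    best = float("inf")
    for q in range(1, max_q + 1):
        p = round(q * SQRT2)
        best = min(best, q * abs(q * SQRT2 - p))
    return best

c = liouville_constant_sqrt2()
observed = min_scaled_distance(2000)
```

In this range the observed minimum stays above $c(\sqrt{2})$, as Theorem 6.6 predicts; the constant from the proof is not claimed to be optimal.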


Lemma 6.5. If for a real number $\alpha$, for any $n \in \mathbb{N}$ the inequality

\[
0 < \Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{1}{q^n}
\]

has infinitely many solutions in integers $p$ and $q > 0$, then $\alpha$ is transcen-
dental.


Proof. Assume on the contrary that $\alpha$ is algebraic. The hypothesis with
$n = 2$ and Lemma 6.4 show that $\alpha$ is irrational, so its degree $m$ satisfies
$m \ge 2$; let $c = c(\alpha) > 0$ be the corresponding constant from Theorem 6.6,
so that $|\alpha - p/q| \ge c/q^m$ for all $p/q$. On the other hand, the hypothesis
implies that there are infinitely many $p/q$ satisfying $|\alpha - p/q| < 1/q^{m+1}$.
In this infinite set we pick one with $q > 1/c$. Then

\[
\Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{1}{q^{m+1}} = \frac{1}{q} \cdot \frac{1}{q^m} < \frac{c}{q^m},
\]

which contradicts the inequality above.

Theorem 6.7. The quantity

\[
\alpha = \sum_{k=1}^{\infty} \frac{1}{2^{k!}}
\]

is transcendental.


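Proof. Write the partial sums of the series

\[
\sum_{k=1}^{n} \frac{1}{2^{k!}} = \frac{p_n}{q_n},
\]

where $q_n = 2^{n!}$ and $p_n$ are certain positive integers, $n = 1, 2, \dots$. Then

\[
r_n = \alpha - \frac{p_n}{q_n} = \sum_{k=n+1}^{\infty} \frac{1}{2^{k!}}
\]

satisfies $r_n > 0$ and

\[
r_n = \frac{1}{2^{(n+1)!}} + \frac{1}{2^{(n+2)!}} + \frac{1}{2^{(n+3)!}} + \dots
< \frac{1}{2^{(n+1)!}} \times \Bigl(1 + \frac{1}{2} + \frac{1}{2^2} + \dots\Bigr) = \Bigl(\frac{1}{q_n}\Bigr)^{n+1} \times 2
\le \frac{1}{q_n^n}
\]

for $n = 1, 2, \dots$. This means that, for each $n$, the inequality $0 < \alpha - p/q <
1/q^n$ has infinitely many solutions, namely,

\[
\frac{p}{q} \in \Bigl\{ \frac{p_n}{q_n}, \frac{p_{n+1}}{q_{n+1}}, \frac{p_{n+2}}{q_{n+2}}, \dots \Bigr\}.
\]

It remains to apply the above corollary to Liouville’s theorem, namely Lemma 6.5.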
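
The estimates in this proof involve only rational numbers and can be verified exactly for small $n$. The following sketch is an illustration only (function names are hypothetical): it builds the partial sums $p_n/q_n$ with exact fractions and compares the geometric-series bound $2/2^{(n+1)!}$ for the tail $r_n$ with $1/q_n^n$.

```python
from fractions import Fraction
from math import factorial

def partial_sum(n):
    """p_n / q_n = sum_{k=1}^{n} 1 / 2^{k!}, as an exact fraction."""
    return sum(Fraction(1, 2 ** factorial(k)) for k in range(1, n + 1))

def tail_upper_bound(n):
    """Upper bound 2 / 2^{(n+1)!} for the tail r_n, from the geometric series."""
    return Fraction(2, 2 ** factorial(n + 1))

rows = []
for n in range(1, 6):
    q_n = 2 ** factorial(n)
    rows.append((n, partial_sum(n).denominator == q_n,
                 tail_upper_bound(n) <= Fraction(1, q_n ** n)))
```

The first check confirms that the reduced denominator of the $n$-th partial sum is exactly $2^{n!}$; the second is the final inequality of the displayed chain.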


6.4 Bounds for the value of polynomial at an algebraic point 


In order to approach transcendence proofs more generally, we need some 
standard information about algebraic numbers; in particular, a suitable 
generalisation of Liouville’s theorem. The next statement comes from alge- 
bra (so that we do not discuss its proof here). 


Lemma 6.6. Let $G$ be a symmetric polynomial from the ring $\mathbb{Z}[x_1, \dots, x_n]$
of $n$ variables and

\[
s_1 = x_1 + \dots + x_n, \quad s_2 = x_1x_2 + \dots + x_{n-1}x_n, \quad \dots, \quad s_n = x_1 \cdots x_n
\]

elementary symmetric polynomials. Then there exists a polynomial $F \in
\mathbb{Z}[y_1, \dots, y_n]$ of degree at most $\deg G$ such that

\[
F(s_1, s_2, \dots, s_n) = G(x_1, \dots, x_n).
\]

The height of a polynomial $P(x) = p_n x^n + \dots + p_1 x + p_0$ is the maximum
of the absolute values of its coefficients, $H(P) = \max_{0 \le k \le n} |p_k|$.


Exercise 6.6. The height $H(\alpha)$ of an algebraic number $\alpha$ is the height
of its minimal polynomial from $\mathbb{Z}[x]$, whose coefficients do not possess a
non-trivial common divisor (a so-called primitive polynomial). Establish
bounds for the heights $H(\alpha + \beta)$ and $H(\alpha\beta)$ from above, where $\alpha$ and $\beta$
are algebraic numbers, by means of $\deg\alpha$, $\deg\beta$, $H(\alpha)$ and $H(\beta)$.


Theorem 6.8. Suppose that $\alpha$ is an algebraic number of degree $n \ge 1$.
Then there exists a constant $c = c(\alpha) > 0$ such that for any polynomial
$P(x)$ with integer coefficients, either $P(\alpha) = 0$ or

\[
|P(\alpha)| \ge \frac{c^k}{H^{n-1}},
\]

where $k = \deg P$ is the degree of the polynomial and $H = H(P)$ its height.


Proof. Denote by $f(x) = a_n x^n + \dots + a_1 x + a_0$ an (irreducible) minimal poly-
nomial of $\alpha$ with integer coefficients and $a_n > 0$; its zeros are $\alpha_1 = \alpha$ and
$\alpha_2, \dots, \alpha_n$. Take $P(x) \in \mathbb{Z}[x]$ an arbitrary polynomial. If $P(\alpha) = 0$ then
there is nothing to prove, so we assume that $P(\alpha) \ne 0$. Then $P(\alpha_j) \ne 0$
for $j = 2, \dots, n$ by Lemma 6.2. Consider the symmetric polynomial

\[
G(x_1, x_2, \dots, x_n) = P(x_1)P(x_2) \cdots P(x_n),
\]

for which $G(x_1, \dots, x_n) = F(s_1, \dots, s_n)$ by Lemma 6.6. Then
$G(\alpha_1, \dots, \alpha_n) = \prod_{j=1}^{n} P(\alpha_j) \ne 0$, while from Viète’s theorem we obtain

\[
G(\alpha_1, \dots, \alpha_n) = F\Bigl(-\frac{a_{n-1}}{a_n}, \frac{a_{n-2}}{a_n}, \dots, (-1)^n \frac{a_0}{a_n}\Bigr).
\]

Since the coefficients of the polynomial $F$ are integral and its degree is at
most $\deg G = nk$, we conclude from the latter result that

\[
G(\alpha_1, \dots, \alpha_n) = \frac{A}{a_n^{nk}}
\]

for some nonzero integer $A$. Furthermore, for all $j = 2, \dots, n$ we have the
estimates

\[
|P(\alpha_j)| \le H(1 + |\alpha_j| + |\alpha_j|^2 + \dots + |\alpha_j|^k) \le H(1 + |\alpha_j|)^k \le H(1 + \overline{|\alpha|})^k,
\]

where $\overline{|\alpha|} = \max_{1 \le j \le n} |\alpha_j|$ is the house of $\alpha$, so that

\[
|G(\alpha_1, \dots, \alpha_n)| \le |P(\alpha)| \, H^{n-1} (1 + \overline{|\alpha|})^{k(n-1)}.
\]

Comparing this with

\[
|G(\alpha_1, \dots, \alpha_n)| = \frac{|A|}{a_n^{nk}} \ge \frac{1}{a_n^{nk}}
\]

we arrive at the desired estimate for $|P(\alpha)|$ with

\[
c = c(\alpha) = \frac{1}{a_n^n (1 + \overline{|\alpha|})^{n-1}}.
\]


Exercise 6.7. The height $H(P)$ of a polynomial in any number of variables
is defined in exactly the same manner as in the single-variable case: it is
the maximum of the absolute values of all coefficients of $P$.

Assume that $\alpha$ and $\beta$ are algebraic numbers of degrees $n$ and $m$, re-
spectively. Prove the following extension of Theorem 6.8: There exists a
constant $c = c(\alpha, \beta) > 0$ such that for any polynomial $P(x, y) \in \mathbb{Z}[x, y]$ in
two variables of total degree $k \ge 1$ and height $H = H(P)$ we either have $P(\alpha, \beta) = 0$ or

\[
|P(\alpha, \beta)| \ge \frac{c^k}{H^{nm-1}}.
\]

This exercise is clearly a part of some more general result, which we
revisit in Chapter 8 (without providing details of proof).


6.5 Transcendence of e


Our proof of transcendence of e relies on an analytical identity, due to 
Hermite, and a simple arithmetic fact. 


Lemma 6.7 (Hermite’s identity). To a polynomial $f(x)$ of degree $N$ with
real coefficients, assign the polynomial

\[
F(x) = f(x) + f'(x) + \dots + f^{(N)}(x),
\]

where the sum is over all (nonzero) derivatives of $f(x)$. Then

\[
F(0)e^x - F(x) = e^x \int_0^x f(t)e^{-t}\,dt
\]

for all $x \ge 0$.


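Proof. The identity follows from the following repeated integration by
parts:

\[
\int_0^x f(t)e^{-t}\,dt = -f(t)e^{-t}\Big|_0^x + \int_0^x f'(t)e^{-t}\,dt = \dots
= -\bigl(f(t) + f'(t) + \dots + f^{(N)}(t)\bigr)e^{-t}\Big|_0^x = F(0) - F(x)e^{-x}.
\]


Lemma 6.8. If $g(x)$ is a polynomial with integer coefficients then so is the
polynomial

\[
\frac{g^{(k)}(x)}{k!} = \frac{1}{k!}\,\frac{d^k g}{dx^k}(x)
\]

for any $k \ge 0$.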


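Proof. Any such $g(x)$ is a $\mathbb{Z}$-linear combination of the monomials $x^n$, where
$n = 0, 1, \dots$, hence it is sufficient to prove the statement for them. If $k > n$
then $(x^n)^{(k)}/k! = 0$; otherwise we have

\[
\frac{(x^n)^{(k)}}{k!} = \frac{n(n-1)\cdots(n-k+1)}{k!} \times x^{n-k} = \binom{n}{k} \times x^{n-k} \in \mathbb{Z}[x].
\]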
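
Lemma 6.8 is easy to confirm on examples. The sketch below is an illustration only (the function names and the sample polynomial are arbitrary): it computes $g^{(k)}(x)/k!$ once through the binomial formula of the proof and once by differentiating $k$ times and dividing by $k!$, with coefficient lists written from the constant term up.

```python
from math import comb, factorial

def scaled_derivative(g, k):
    """Coefficients of g^{(k)}(x) / k!, using (x^n)^{(k)} / k! = C(n, k) x^{n-k}."""
    return [comb(n, k) * g[n] for n in range(k, len(g))]

def derivative(g):
    """Coefficients of g'(x)."""
    return [n * g[n] for n in range(1, len(g))]

# g(x) = 7x^4 + 2x^3 - 3x + 5
g = [5, -3, 0, 2, 7]

def check(g, k):
    """True if k-fold differentiation gives k! times scaled_derivative(g, k)."""
    d = g
    for _ in range(k):
        d = derivative(d)
    return (all(c % factorial(k) == 0 for c in d)
            and [c // factorial(k) for c in d] == scaled_derivative(g, k))
```

For instance, `scaled_derivative(g, 2)` corresponds to $g''(x)/2! = 42x^2 + 6x$.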


Exercise 6.8. A polynomial $P(x) \in \mathbb{Q}[x]$ is said to be integer-valued if
$P(k) \in \mathbb{Z}$ for all $k \in \mathbb{Z}$. Examples of such polynomials are $(x + 2)(x - 5)/2$,
$x(x^2 - 1)/6$ and, more generally, the binomials

\[
\binom{x}{n} = \frac{x(x-1)\cdots(x-n+1)}{n!}, \quad n = 0, 1, 2, \dots.
\]

Prove that any integer-valued polynomial can be written as a linear
combination of $\binom{x}{n}$ with integer coefficients.

Exercise 6.9. If $P(x)$ is an integer-valued polynomial of degree $m \ge 1$
and $M$ is the least common multiple of $1, 2, \dots, m$, show that $MP'(x)$ is
integer-valued.

Theorem 6.9 (Hermite). The number $e$ is transcendental.


Proof. Assume, on the contrary, that $e$ is algebraic and choose $\varphi(x) =
a_m x^m + \dots + a_1 x + a_0$ to be its minimal polynomial with integer coefficients.
Since $\varphi(x)$ is irreducible, we have $a_0 \ne 0$. For a sufficiently large $n$, whose
choice we will finalise later, consider the polynomial

\[
f(x) = \frac{1}{(n-1)!}\, x^{n-1}(x-1)^n(x-2)^n \cdots (x-m)^n.
\]

Apply Hermite’s identity from Lemma 6.7 with this choice of $f(x)$ and for
each $x = 0, 1, \dots, m$, and collect the results into a single linear combination
with the related coefficients $a_0, a_1, \dots, a_m$:

\[
-\sum_{k=0}^{m} a_k F(k) = \sum_{k=0}^{m} a_k e^k \int_0^k f(t)e^{-t}\,dt,
\]

where we use the fact that $\sum_{k=0}^{m} a_k e^k F(0) = \varphi(e)F(0) = 0$. Our aim is to
show that the expression on the left-hand side of the formula is a nonzero
integer when $n > \max\{m, |a_0|\}$ is a prime number, while the right-hand
side tends to 0 as $n \to \infty$.

It follows from the definition of $f(x)$ that $f^{(j)}(0) = 0$ for $j = 0, 1, \dots,
n - 2$ and

\[
f^{(n-1)}(0) = (x-1)^n(x-2)^n \cdots (x-m)^n \big|_{x=0} = (-1)^{mn} m!^n;
\]

we also have $f^{(j)}(k) = 0$ for all $j = 0, 1, \dots, n-1$ and $k = 1, 2, \dots, m$.
From Lemma 6.8 applied to the polynomial

\[
g(x) = (n-1)!\, f(x) = x^{n-1}(x-1)^n(x-2)^n \cdots (x-m)^n
\]

we conclude that the derivatives of $f(x)$ of orders $j = n, n+1, \dots$ have
integer coefficients that are all divisible by $n = n!/(n-1)!$. In particular,
$f^{(j)}(k) \in n\mathbb{Z}$ for all such $j$ and
$k = 0, 1, \dots, m$. Summarising our findings and using the definition of the
polynomial $F(x)$ we see that

\[
\sum_{k=0}^{m} a_k F(k) = (-1)^{mn} m!^n a_0 + nA
\]

for some $A \in \mathbb{Z}$. If $n$ is a prime number satisfying $n > m$ and $n > |a_0|$
then the integer $(-1)^{mn} m!^n a_0$ is not divisible by $n$. This means that the whole
expression is not divisible by $n$, so it is a nonzero integer. The implication
is the estimate

\[
\Bigl|\sum_{k=0}^{m} a_k F(k)\Bigr| \ge 1.
\]

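On the other hand, we have

\[
|f(t)| \le \frac{m^{n-1} \cdot m^n \cdots m^n}{(n-1)!} = \frac{m^{(m+1)n-1}}{(n-1)!}
\]

for all $t$ in the interval $0 \le t \le m$, so that

\[
\begin{aligned}
\Bigl|\sum_{k=0}^{m} a_k e^k \int_0^k f(t)e^{-t}\,dt\Bigr|
&\le \sum_{k=0}^{m} |a_k| e^k \int_0^k |f(t)| e^{-t}\,dt \\
&\le \frac{m^{(m+1)n-1}}{(n-1)!} \sum_{k=0}^{m} |a_k| e^k \int_0^k e^{-t}\,dt \\
&\le \frac{m^{(m+1)n-1} e^m}{(n-1)!} \sum_{k=0}^{m} |a_k| < 1
\end{aligned}
\]

for sufficiently large $n$, since $c^n/(n-1)! \to 0$ as $n \to \infty$ for any fixed $c > 0$
(here $c = m^{m+1}$). The two contradictory estimates we have obtained for
$\sum_{k=0}^{m} a_k F(k)$ imply that $e$ cannot be algebraic.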
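
The arithmetic half of this proof, namely that $F(0) \equiv (-1)^{mn} m!^n \pmod{n}$ and $F(k) \equiv 0 \pmod{n}$ for $k = 1, \dots, m$, can be checked exactly for small parameters. The sketch below is an illustration only: the helper names are hypothetical and the pairs $(m, n)$ are arbitrary small choices; it builds $f(x)$ with exact fractions, sums all its derivatives to obtain $F(x)$, and evaluates $F$ at $0, 1, \dots, m$.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def derivative(p):
    return [n * p[n] for n in range(1, len(p))]

def evaluate(p, x):
    value = 0
    for c in reversed(p):      # Horner's scheme
        value = value * x + c
    return value

def hermite_F_values(m, n):
    """F(0), ..., F(m) for f(x) = x^(n-1) (x-1)^n ... (x-m)^n / (n-1)!."""
    g = [0] * (n - 1) + [1]                      # x^(n-1)
    for k in range(1, m + 1):
        for _ in range(n):
            g = poly_mul(g, [-k, 1])             # multiply by (x - k)
    f = [Fraction(c, factorial(n - 1)) for c in g]
    F = [Fraction(0)] * len(f)
    d = f
    while d:                                     # add f, f', f'', ... until zero
        for i, c in enumerate(d):
            F[i] += c
        d = derivative(d)
    return [evaluate(F, k) for k in range(m + 1)]

values_2_5 = hermite_F_values(2, 5)
values_3_7 = hermite_F_values(3, 7)
```

Only the congruences are verified here; the analytic half of the proof (smallness of the integrals) is not touched by this check.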
6.6 Irrationality of π


A modification of the argument from the previous section allows one to
prove the irrationality of $\pi$.


Lemma 6.9. To a polynomial $f(x)$ with real coefficients, assign the poly-
nomial

\[
F(x) = f(x) - f''(x) + f^{(4)}(x) - f^{(6)}(x) + \dots.
\]

Then

\[
F(0) + F(\pi) = \int_0^{\pi} f(t) \sin t \, dt.
\]


Proof. The identity follows from integration of the equality

\[
\frac{d}{dx}\bigl(F'(x)\sin x - F(x)\cos x\bigr) = \bigl(F''(x) + F(x)\bigr)\sin x = f(x)\sin x.
\]


Theorem 6.10. The number $\pi$ is irrational.


Proof. Assuming that $\pi$ is rational, $\pi = a/b$ with $b > 0$, say, apply the
identity of Lemma 6.9 to the polynomial

\[
f(x) = \frac{b^n}{n!}\, x^n (\pi - x)^n = \frac{1}{n!}\, x^n (a - bx)^n,
\]

where $n$ is chosen sufficiently large. We have $f(0) = f'(0) = \dots =
f^{(n-1)}(0) = 0$, and also $f^{(j)}(0) \in \mathbb{Z}$ for $j \ge n$ from Lemma 6.8 ap-
plied to the polynomial $g(x) = x^n(a - bx)^n \in \mathbb{Z}[x]$. This means that
$f^{(j)}(0) \in \mathbb{Z}$ for all $j \ge 0$. Because of the symmetry $f(\pi - x) = f(x)$, we get
$f^{(j)}(\pi - x) = (-1)^j f^{(j)}(x)$, hence $f^{(j)}(\pi) = (-1)^j f^{(j)}(0) \in \mathbb{Z}$ for all $j \ge 0$.
Combining this we conclude that both $F(0)$ and $F(\pi)$ are integers, hence

\[
\int_0^{\pi} f(t)\sin t \, dt = F(0) + F(\pi) \in \mathbb{Z}.
\]

On the other hand, the integrand $f(t)\sin t$ is clearly positive on the interval
$0 < t < \pi$ and possesses the estimate

\[
f(t)\sin t \le f(t) < \frac{b^n \pi^{2n}}{n!}
\]

there; therefore,

\[
0 < \int_0^{\pi} f(t)\sin t \, dt < \frac{b^n \pi^{2n+1}}{n!} < 1
\]

for all large $n$, since $b^n\pi^{2n+1}/n! \to 0$ as $n \to \infty$. As no integer exists
between 0 and 1, our estimates lead to contradiction. Thus, $\pi$ cannot be
rational.


6.7 Newton’s interpolating series for sin πz


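Assume that a function $f(z)$ is analytic in a domain $D \subset \mathbb{C}$, while $z_1, \dots, z_n$
is a fixed collection of points in $D$, with possible repetitions. Define
$F_0(\zeta) = 1$ and

\[
F_k(\zeta) = (\zeta - z_k)F_{k-1}(\zeta) = (\zeta - z_1) \cdots (\zeta - z_k) \quad \text{for } k = 1, \dots, n.
\]

Multiplying both sides of the elementary identity

\[
\frac{1}{\zeta - z} \Bigl(1 - \frac{z - z_k}{\zeta - z_k}\Bigr) = \frac{1}{\zeta - z_k}, \quad \text{where } k = 1, \dots, n,
\]

by $F_{k-1}(z)/F_{k-1}(\zeta)$ we arrive at

\[
\frac{1}{\zeta - z} \Bigl(\frac{F_{k-1}(z)}{F_{k-1}(\zeta)} - \frac{F_k(z)}{F_k(\zeta)}\Bigr) = \frac{F_{k-1}(z)}{F_k(\zeta)}, \quad \text{where } k = 1, \dots, n.
\]

Summing them over $k$ we get

\[
\frac{1}{\zeta - z} \Bigl(1 - \frac{F_n(z)}{F_n(\zeta)}\Bigr) = \sum_{k=1}^{n} \frac{F_{k-1}(z)}{F_k(\zeta)};
\]

equivalently,

\[
\frac{1}{\zeta - z} = \sum_{k=1}^{n} \frac{F_{k-1}(z)}{F_k(\zeta)} + \frac{F_n(z)}{F_n(\zeta)(\zeta - z)}.
\]

Now take a simple closed contour $C$ within the domain $D$, which encloses
all the points $z_1, \dots, z_n$ and a point $z$. Multiplying the identity obtained
by $f(\zeta)/(2\pi i)$ and integrating the result along $C$ we obtain

\[
f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\,d\zeta
= \sum_{k=1}^{n} F_{k-1}(z) \cdot \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{F_k(\zeta)}\,d\zeta
+ \frac{1}{2\pi i} \oint_C \frac{F_n(z)}{F_n(\zeta)} \, \frac{f(\zeta)}{\zeta - z}\,d\zeta.
\]

The resulting formula is known as Newton’s interpolation formula with
interpolation nodes $z_1, \dots, z_n$.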
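
When the nodes are pairwise distinct, the contour integrals $\frac{1}{2\pi i}\oint_C f(\zeta)/F_k(\zeta)\,d\zeta$ reduce, by the residue theorem, to the classical divided differences of $f$ at $z_1, \dots, z_k$. The sketch below illustrates this special case only (repeated nodes, which the text allows, are not handled); the function names and the sample cubic are choices made for the illustration.

```python
from fractions import Fraction

def newton_coefficients(nodes, f):
    """Divided differences f[z_1], f[z_1,z_2], ..., f[z_1,...,z_n].

    The nodes must be pairwise distinct. Entry k of the result multiplies
    (z - z_1)...(z - z_k) in Newton's formula.
    """
    col = [Fraction(f(z)) for z in nodes]
    coeffs = [col[0]]
    for k in range(1, len(nodes)):
        col = [(col[i + 1] - col[i]) / (nodes[i + k] - nodes[i])
               for i in range(len(nodes) - k)]
        coeffs.append(col[0])
    return coeffs

def newton_eval(nodes, coeffs, z):
    """Evaluate sum_k coeffs[k] * (z - z_1)...(z - z_k)."""
    total, basis = Fraction(0), Fraction(1)
    for a, node in zip(coeffs, nodes):
        total += a * basis
        basis *= (z - node)
    return total

def cubic(x):
    return x ** 3 - 2 * x + 5

nodes = [0, 1, 2, 3]
coeffs = newton_coefficients(nodes, cubic)
```

For a polynomial of degree less than the number of nodes the remainder term vanishes, so the finite Newton sum reproduces the polynomial exactly at every point, not only at the nodes.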


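Lemma 6.10 (Newton’s interpolation). In the above notation, denote

\[
A_{k-1} = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{F_k(\zeta)}\,d\zeta \quad \text{for } k = 1, \dots, n
\]

and

\[
R_n(z) = \frac{1}{2\pi i} \oint_C \frac{F_n(z)}{F_n(\zeta)} \, \frac{f(\zeta)}{\zeta - z}\,d\zeta.
\]

Then for $z$ within the contour $C$,

\[
f(z) = \sum_{k=0}^{n-1} A_k F_k(z) + R_n(z).
\]


From now on, think of a collection of points $z_1, z_2, \dots$ in $D$ being infinite.
Assuming that

\[
R_n(z) = \frac{1}{2\pi i} \oint_C \frac{(z - z_1) \cdots (z - z_n)}{(\zeta - z_1) \cdots (\zeta - z_n)} \, \frac{f(\zeta)}{\zeta - z}\,d\zeta \to 0 \quad \text{as } n \to \infty
\]

for all $z$ from a domain $D_0 \subset D$, we then obtain the so-called Newton’s
interpolating series

\[
f(z) = \sum_{k=0}^{\infty} A_k F_k(z) = \sum_{k=0}^{\infty} A_k (z - z_1) \cdots (z - z_k).
\]

If $f(z)$ does not happen to be a polynomial, the latter equality involves
infinitely many terms, hence $A_k \ne 0$ for infinitely many indices $k$.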

We will next write Newton’s interpolating series for the function $f(z) =
\sin \pi z$ using the collection of nodes $z_1, z_2, \dots$ as follows. We fix a positive
integer $m$ and define $z_k = k$ for $k = 1, \dots, m$ and then $z_{m+k} = z_k$ for all
$k \ge 1$. Choose and fix an arbitrary real $R > m$, so that all nodes are within
the disc of radius $R$, and consider the remainder

\[
R_n(z) = \frac{1}{2\pi i} \oint_C \frac{(z - z_1) \cdots (z - z_n) \sin \pi\zeta}{(\zeta - z_1) \cdots (\zeta - z_n)(\zeta - z)}\,d\zeta
\]

for $|z| \le R$ and $n \ge 2R$, choosing the contour $C$ to be the circle $|\zeta| = n$.
First, $|z - z_k| \le |z| + |z_k| \le R + m$ for $k = 1, \dots, n$. Second, on the circle
$|\zeta| = n \ge 2R > 2m$ we have

\[
|\zeta - z_k| \ge |\zeta| - |z_k| \ge n - m > \frac{n}{2} \quad \text{for } k = 1, \dots, n
\]

and $|\zeta - z| \ge |\zeta| - |z| \ge n - R \ge n/2$. Third, on the same circle we get

\[
|\sin \pi\zeta| = \Bigl|\frac{e^{\pi i\zeta} - e^{-\pi i\zeta}}{2i}\Bigr| \le e^{\pi|\zeta|} = e^{\pi n}.
\]

Combining all these estimates, we deduce that

\[
|R_n(z)| \le \frac{1}{2\pi} \oint_{|\zeta| = n} \frac{(R + m)^n e^{\pi n}}{(n/2)^{n+1}}\,|d\zeta|
= \frac{2^{n+1}(R + m)^n e^{\pi n}}{n^n} \to 0 \quad \text{as } n \to \infty,
\]

since $R$ and $m$ are fixed. The quantity $R$ can be chosen from the beginning
arbitrarily large, to include a given $z$ inside the disc of radius $R$, so that the
interpolating series is valid for $\sin \pi z$ with any choice of complex $z$.

Lemma 6.11 (Newton’s interpolation for $\sin \pi z$). Define the collection
$z_1, z_2, \dots$ by $z_k = k$ for $k = 1, \dots, m$ and, recursively, $z_{m+k} = z_k$ for all
$k \ge 1$. Take $z \in \mathbb{C}$. Then

\[
\sin \pi z = \sum_{n=0}^{\infty} A_n (z - z_1) \cdots (z - z_n),
\]

where for $n \ge 2\max\{m, |z|\}$ the coefficients $A_n$ satisfy

\[
|A_n| \le \exp(-n \ln n + 5n).
\]

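Proof. It remains to show the estimates for

\[
A_n = \frac{1}{2\pi i} \oint_{|\zeta| = n} \frac{\sin \pi\zeta}{(\zeta - z_1) \cdots (\zeta - z_{n+1})}\,d\zeta.
\]

With the help of the bounds above we deduce that

\[
|A_n| \le \frac{1}{2\pi} \cdot \frac{e^{\pi n}}{(n/2)^{n+1}} \cdot 2\pi n = \frac{e^{\pi n + (n+1)\ln 2}}{n^n} \le \frac{e^{5n}}{n^n},
\]

the required bound.


6.8 Transcendence of π
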
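For the periodic collection $z_1, z_2, \dots$ defined by $z_k = k$ for $k = 1, \dots, m$
and $z_{m+k} = z_k$ for all $k \ge 1$, we can write

\[
(z - z_1) \cdots (z - z_{n+1}) = \prod_{k=1}^{m} (z - k)^{r_k + 1},
\]

where the integers $r_k$, for $k = 1, \dots, m$, satisfy the hypotheses

\[
r_1 + r_2 + \dots + r_m + m = n + 1,
\]

\[
r_1 - 1 \le r_m \le r_{m-1} \le \dots \le r_2 \le r_1 \le \frac{n}{m}.
\]

Then the interpolation coefficients $A_n$ in Lemma 6.11 can be given by

\[
A_n = \frac{1}{2\pi i} \oint_{|\zeta| = N} \frac{\sin \pi\zeta}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}\,d\zeta
\]

for (any) choice of $N > 2m$. Denote $r = r_1 = \max_{1 \le k \le m}\{r_k\}$ and let $M$ be the
least common multiple of the numbers $1, 2, \dots, m$.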


Lemma 6.12. For each $n \ge 0$, there exists a polynomial $P_n(x) \in \mathbb{Z}[x]$ of
degree at most $r$ and of height not exceeding $r!(2M)^n$ such that

\[
M^{n-1} r! \, A_n = P_n(\pi).
\]


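Proof. Use Cauchy’s residue sum theorem to write

\[
A_n = \sum_{k=1}^{m} \frac{1}{2\pi i} \oint_{\Gamma_k} \frac{\sin \pi\zeta}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}\,d\zeta,
\]

where $\Gamma_k$ are circles $|\zeta - k| = \frac{1}{2}$ traversed in the positive direction. Develop
the function $\sin \pi\zeta$ into its Taylor series at $\zeta = k$:

\[
\begin{aligned}
\sin \pi\zeta &= \sin(\pi k + \pi(\zeta - k)) = (-1)^k \sin \pi(\zeta - k) \\
&= \sum_{j=0}^{\infty} \frac{(-1)^{j+k}\pi^{2j+1}}{(2j+1)!} (\zeta - k)^{2j+1} \\
&= \sum_{0 \le j \le (r_k - 1)/2} \frac{(-1)^{j+k}\pi^{2j+1}}{(2j+1)!} (\zeta - k)^{2j+1} + R_k(\zeta),
\end{aligned}
\]

where $R_k(\zeta)$ is an entire function with the zero of order at least $r_k + 1$ at
$\zeta = k$, where $k = 1, \dots, m$. Therefore,

\[
\frac{1}{2\pi i} \oint_{\Gamma_k} \frac{R_k(\zeta)}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}\,d\zeta = 0,
\]

hence

\[
\frac{1}{2\pi i} \oint_{\Gamma_k} \frac{\sin \pi\zeta}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}\,d\zeta
= \sum_{0 \le j \le (r_k - 1)/2} \frac{(-1)^{j+k}\pi^{2j+1}}{(2j+1)!} \cdot \frac{1}{2\pi i} \oint_{\Gamma_k} \frac{(\zeta - k)^{2j+1}}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}\,d\zeta.
\]

Denote, for each $k = 1, \dots, m$,

\[
a_{kj} = \frac{1}{2\pi i} \oint_{\Gamma_k} \frac{(\zeta - k)^{2j+1}}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}\,d\zeta,
\]

where $0 \le j \le (r_k - 1)/2$.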

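We shall now check that the numbers $a_{kj}$ are rational such that
$M^{n-1}a_{kj} \in \mathbb{Z}$. Each $a_{kj}$ is a residue of the integrand at $\zeta = k$, that is, the
coefficient of $(\zeta - k)^{-1}$ in the Laurent expansion of the rational function

\[
\frac{(\zeta - k)^{2j+1}}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}
\]

at $\zeta = k$. For integers $s \ne k$ in the range $1 \le s \le m$, we have

\[
\frac{1}{\zeta - s} = -\frac{1}{s - k} \cdot \frac{1}{1 - (\zeta - k)/(s - k)}
= -\frac{1}{s - k} \sum_{\ell=0}^{\infty} \Bigl(\frac{\zeta - k}{s - k}\Bigr)^{\ell}
= -\sum_{\ell=0}^{\infty} \frac{(\zeta - k)^{\ell}}{(s - k)^{\ell + 1}},
\]

implying after $r_s$-repeated differentiation that

\[
\frac{1}{(\zeta - s)^{r_s + 1}} = (-1)^{r_s + 1} \sum_{\ell = r_s}^{\infty} \binom{\ell}{r_s} \frac{(\zeta - k)^{\ell - r_s}}{(s - k)^{\ell + 1}}
= (-1)^{r_s + 1} \sum_{\ell = 0}^{\infty} \binom{\ell + r_s}{r_s} \frac{(\zeta - k)^{\ell}}{(s - k)^{\ell + r_s + 1}}.
\]

(If $r_s = -1$ then the Laurent series expansion at $\zeta = k$ is simply $1/(\zeta -
s)^{r_s + 1} = 1$.) Therefore, the coefficient of $(\zeta - k)^{-1}$ in the Laurent expansion
of

\[
\frac{(\zeta - k)^{2j+1}}{(\zeta - 1)^{r_1 + 1} \cdots (\zeta - m)^{r_m + 1}}
= \frac{1}{(\zeta - k)^{r_k - 2j}} \cdot \prod_{\substack{s = 1 \\ s \ne k}}^{m} \frac{1}{(\zeta - s)^{r_s + 1}}
\]

at $\zeta = k$ is equal to a linear combination with integral coefficients of prod-
ucts

\[
\prod_{\substack{s = 1 \\ s \ne k}}^{m} \frac{1}{(s - k)^{\ell_s + r_s + 1}}
\]

for which $\ell_1 + \dots + \ell_{k-1} + \ell_{k+1} + \dots + \ell_m = r_k - (2j + 1)$. Since $|s - k|$
is an integer between 1 and $m$ while $M$ denotes the least common multiple
of all integers in the range, we have $M/(s - k) \in \mathbb{Z}$ and

\[
\prod_{\substack{s = 1 \\ s \ne k}}^{m} \Bigl(\frac{M}{s - k}\Bigr)^{\ell_s + r_s + 1} \in \mathbb{Z}.
\]

Here

\[
\sum_{\substack{s = 1 \\ s \ne k}}^{m} (\ell_s + r_s + 1) = (m - 1) + \sum_{s=1}^{m} r_s - (2j + 1) = n - (2j + 1) \le n - 1,
\]

so that indeed $M^{n-1}a_{kj} \in \mathbb{Z}$.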
We now summarise our findings as follows: the quantity

\[
r! \, M^{n-1} A_n = \sum_{k=1}^{m} \sum_{0 \le j \le (r_k - 1)/2} (-1)^{j+k} M^{n-1} a_{kj} \, \frac{r!}{(2j+1)!} \, \pi^{2j+1}
\]

is a polynomial $P_n(x) \in \mathbb{Z}[x]$ evaluated at $x = \pi$, as required. It only
remains to estimate the height of the polynomial from above. For each $a_{kj}$
we use the defining integral representation and the fact that $|\zeta - k| = \frac{1}{2}$
and $|\zeta - s| \ge \frac{1}{2}$ for $s \ne k$ on the contour $\Gamma_k$:

\[
|a_{kj}| \le \frac{1}{2\pi} \times \pi \times \frac{(1/2)^{2j+1}}{(1/2)^{n+1}} = 2^{n - 2j - 1} \le 2^{n-1}
\]

for $0 \le j \le (r_k - 1)/2$ and $1 \le k \le m$. Thus, the absolute values of integer
coefficients of $P_n(x)$ do not exceed

\[
m \times r! \, 2^{n-1} M^{n-1} \le r!(2M)^n.
\]

Theorem 6.11 (Lindemann). The number $\pi$ is transcendental.


Proof. Assume on the contrary that $\pi$ is algebraic, of degree $m - 1$. With
this $m$ consider the expansion of $\sin \pi z$ in the Newton interpolation series,

\[
\sin \pi z = \sum_{n=0}^{\infty} A_n (z - z_1) \cdots (z - z_n),
\]

where $|A_n| \le \exp(-n \ln n + 5n)$ for $n \ge 2\max\{m, |z|\}$. We choose $n$ suffi-
ciently large and consider the polynomial $P_n(x)$ constructed in Lemma 6.12,
whose degree $\deg P_n \le r$ and height $H = H(P_n) \le r!(2M)^n$. The letters
$c, c_1, c_2$ below are used to denote constants that only depend on $m$ (recall
that $m - 1$ is the degree of algebraic number $\pi$). It follows from Theorem 6.8
that either $P_n(\pi) = 0$ or

\[
|P_n(\pi)| \ge \frac{c^r}{H^{m-2}} = e^{r \ln c - (m-2)\ln H}
\ge e^{-r|\ln c| - (m-2)(r \ln r + n \ln(2M))}
\ge \exp\Bigl(-\frac{m-2}{m}\, n \ln n - c_1 n\Bigr),
\]

where we used $r \le n/m$. On the other hand, from the estimate for $A_n$ and
the fact that $P_n(\pi) = r! \, M^{n-1} A_n$ we find out that

\[
|P_n(\pi)| \le \exp(-n \ln n + 5n + (n - 1)\ln M + r \ln r)
\le \exp\Bigl(-\frac{m-1}{m}\, n \ln n + c_2 n\Bigr).
\]

Comparing the two estimates for $|P_n(\pi)|$ we conclude that they are in-
compatible for all $n$ sufficiently large, $n \ge N$. This means that we have
$P_n(\pi) = 0$ for all $n \ge N$, so that $A_n = 0$ for all such $n$, hence $\sin \pi z$ is
a polynomial. At the same time, it is not (for example, because it has in-
finitely many zeros on the real line). The contradiction we arrived at proves
that $\pi$ is transcendental.


Exercise 6.10. Prove that the function $f(z) = \sin \pi z$ is transcendental. In
other words, show that there is no polynomial

\[
P(z, y) = P_n(z) y^n + \dots + P_1(z) y + P_0(z) \not\equiv 0
\]

with rational-function coefficients $P_j(z)$ such that $P(z, f(z)) = 0$ identically
in $z$.

Chapter notes


The proof of Theorem 6.5 is based on the argument of Liouville (1840). 

Dirichlet’s theorem (Theorem 6.4) and Liouville’s theorem (Theo-
rem 6.6) suggest investigating in general the question about how well a
real number (not necessarily algebraic!) can be approximated by rationals.
A standard way for measuring the quality of rational approximations to
$\alpha \in \mathbb{R}$ is in terms of the irrationality exponent $\mu(\alpha)$, which is defined as the
supremum of $\mu > 0$ for which the inequality

\[
0 < \Bigl|\alpha - \frac{p}{q}\Bigr| < \frac{1}{q^{\mu}}
\]

has infinitely many solutions in integers $p$ and $q \ne 0$. Then Dirichlet’s
theorem translates into the inequality $\mu(\alpha) \ge 2$ for all irrational real $\alpha$,
while Liouville’s theorem tells that $\mu(\alpha) \le n$ for algebraic real $\alpha$ of degree
at most $n$. The last result is in fact best possible for quadratic irrationalities
(when $n = 2$), as follows from Theorems 4.4 and 4.9, but not for $n > 2$.
Roth’s celebrated theorem [68] proves that $\mu(\alpha) = 2$ for all irrational algebraic $\alpha \in \mathbb{R}$,
and with a simple argument from the measure theory one can show that
the irrationality exponent is 2 for almost every real number (in the sense of
Lebesgue measure). But it is not always 2! Already the Liouville number $\alpha$
in Theorem 6.7 has $\mu(\alpha) = \infty$, as follows from the inequalities established
in its proof. While showing that $\mu(e) = 2$ is comparatively simple (see,
for example, [14, Section 2.12]), estimating the irrationality exponent from
above of other ‘interesting’ irrational constants is a competitive business.
The latest record bound [85] set for the number $\pi$ is $\mu(\pi) \le 7.103205\dots$
(though we expect it to be 2).


Chapter 7 


Irrationality of zeta values 


In this chapter we discuss arithmetic properties of the values of Riemann’s
zeta function $\zeta(s)$ at integers $s = 2, 3, 4, \dots$.

As we already know from Section 3.3 (see Proposition 3.7), the values
of Riemann’s zeta function $\zeta(s)$ at positive even integers $s = 2k$ happen to
be rational multiples of $\pi^{2k}$, where $k = 1, 2, \dots$. Now, using the fact that
$\pi$ is a transcendental number (Theorem 6.11) we end up with the following
immediate corollary.


Theorem 7.1. The value $\zeta(2k)$ of Riemann’s zeta function at a positive even
integer $s = 2k$ is an irrational and transcendental number.


Much less is known on the arithmetic nature of the zeta values at odd
integers $s = 3, 5, 7, \dots$: in 1978, Apéry proved [3,64] the irrationality of the
number

\[
\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} = 1.20205690315959428539973816151144999076498629234049\dots,
\]

and there are more recent but partial linear independence results of Rivoal
[67] and others. Rivoal’s theorem [67] settles the infinitude of the set of
irrational numbers among $\zeta(3), \zeta(5), \zeta(7), \dots$. Conjecturally, each of these
numbers is transcendental, and a complete answer to the above-stated ques-
tion, about polynomial relations over $\mathbb{Q}$ for the values of $\zeta(s)$ with $s \ge 2$
integer, looks rather simple.


Conjecture 7.1. The numbers

\[
\pi, \; \zeta(3), \; \zeta(5), \; \zeta(7), \; \zeta(9), \; \dots
\]

are algebraically independent over $\mathbb{Q}$.

This conjecture may be regarded as mathematical folklore. It seems to
be unattainable by the present methods. Below we give a proof of Apéry’s
result and then discuss a partial result about the irrationality of other odd
zeta values.


7.1 Arithmetic of linear forms in 1 and $\zeta(3)$


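For $n = 0, 1, 2, \dots$, consider the rational function

\[
Q_n(t) = \frac{(t-1)(t-2)\cdots(t-n)}{t(t+1)\cdots(t+n)}
= \frac{\prod_{j=1}^{n}(t-j)}{\prod_{j=0}^{n}(t+j)}.
\]

As the degree of its numerator is less than that of its denominator, its
partial-fraction decomposition assumes the form

\[
Q_n(t) = \sum_{k=0}^{n} \frac{c_k}{t+k}.
\]

Lemma 7.1. The coefficients $c_k$, where $k = 0, 1, \dots, n$, are integers.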


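Proof. The standard procedure of expanding a rational fraction into the
sum of partial fractions leads to

\[
c_k = Q_n(t)(t+k)\big|_{t=-k}
= \frac{\prod_{j=1}^{n}(t-j)}{\prod_{j=0}^{k-1}(t+j) \times \prod_{j=k+1}^{n}(t+j)}\bigg|_{t=-k}
= (-1)^{n-k}\binom{n+k}{n}\binom{n}{k} \in \mathbb{Z}
\]

for $k = 0, 1, \dots, n$.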
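The residue computation behind Lemma 7.1 can be checked mechanically. The following Python sketch (the function names are ours and purely illustrative) evaluates $Q_n(t)(t+k)$ at $t = -k$ in exact rational arithmetic and compares the result with the closed form $(-1)^{n-k}\binom{n+k}{n}\binom{n}{k}$ for small $n$.

```python
from fractions import Fraction
from math import comb


def c_residue(n, k):
    """Q_n(t) * (t + k) evaluated at t = -k, in exact arithmetic."""
    t = -k
    num = 1
    for j in range(1, n + 1):
        num *= t - j
    den = 1
    for j in range(n + 1):
        if j != k:  # the factor (t + k) has been cancelled
            den *= t + j
    return Fraction(num, den)


def c_closed(n, k):
    """Closed form (-1)^(n-k) * C(n+k, n) * C(n, k)."""
    return (-1) ** (n - k) * comb(n + k, n) * comb(n, k)


def check_lemma_7_1(n_max=8):
    """True if both expressions agree and are integers for all n <= n_max."""
    return all(
        c_residue(n, k) == c_closed(n, k) and c_residue(n, k).denominator == 1
        for n in range(n_max + 1)
        for k in range(n + 1)
    )


print(check_lemma_7_1())
```

Such a finite check is of course no substitute for the proof; it only guards against sign and indexing slips.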


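Exercise 7.1. Show that the coefficients in the partial-fraction decompositions
of the rational functions

\[
\frac{n!}{\prod_{j=0}^{n}(t+j)}, \quad
\frac{\prod_{j=1}^{n}(t+n+j)}{\prod_{j=0}^{n}(t+j)}, \quad
\frac{2^{2n}\prod_{j=1}^{n}(t-n-\frac12+j)}{\prod_{j=0}^{n}(t+j)},
\]
\[
\frac{2^{2n}\prod_{j=1}^{n}(t-\frac12+j)}{\prod_{j=0}^{n}(t+j)}, \quad
\frac{2^{2n}\prod_{j=1}^{n}(t+n-\frac12+j)}{\prod_{j=0}^{n}(t+j)}
\]

possess the same property as displayed in Lemma 7.1.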


Denote $d_n = \operatorname{lcm}(1, 2, \dots, n)$. The asymptotics of this quantity is
controlled by the prime number theorem.

Lemma 7.2. We have

\[
\lim_{n\to\infty} \frac{\ln d_n}{n} = 1;
\]

in other words, $d_n$ grows with $n$ like $e^{n+o(n)}$ as $n \to \infty$.


Proof. It is not hard to see that $d_n$ is a product over primes $p \le n$, each
entering with exponent $k$ such that $p^k \le n < p^{k+1}$. This means that

\[
\ln d_n = \sum_{p^k \le n < p^{k+1}} k \ln p
= \sum_{p \le n} \Big\lfloor \frac{\ln n}{\ln p} \Big\rfloor \ln p = \psi(n),
\]

where $\psi$ is Chebyshev’s function from Section 2.6. Thus, the required
asymptotics follows from Lemma 2.10 and Theorem 2.8.
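As a numerical illustration of Lemma 7.2, the sketch below computes $d_n = \operatorname{lcm}(1, \dots, n)$ directly, compares $\ln d_n$ with Chebyshev’s $\psi(n)$ (computed here by a naive trial-division sieve, chosen for brevity rather than speed) and reports $\ln d_n / n$. The helper names are our own.

```python
from math import gcd, log


def lcm_upto(n):
    """d_n = lcm(1, 2, ..., n)."""
    d = 1
    for m in range(1, n + 1):
        d = d * m // gcd(d, m)
    return d


def chebyshev_psi(n):
    """psi(n): sum of ln p over all prime powers p^k <= n."""
    total = 0.0
    for p in range(2, n + 1):
        if all(p % q for q in range(2, int(p ** 0.5) + 1)):  # p is prime
            power = p
            while power <= n:
                total += log(p)
                power *= p
    return total


def log_lcm(n):
    """ln d_n, computed from the exact integer d_n."""
    return log(lcm_upto(n))


for n in (10, 100, 1000):
    print(n, log_lcm(n) / n, chebyshev_psi(n) / n)
```

The ratio approaches 1 only slowly, in line with the $e^{n+o(n)}$ form of the statement.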


We now turn our attention to the rational function

\[
R_n(t) = Q_n(t)^2 = \frac{\prod_{j=1}^{n}(t-j)^2}{\prod_{j=0}^{n}(t+j)^2},
\]

which plays a special role in our construction of rational approximations to
$\zeta(3)$.


Lemma 7.3. The rational coefficients in the partial-fraction decomposition

\[
R_n(t) = \sum_{k=0}^{n} \bigg( \frac{a_k}{(t+k)^2} + \frac{b_k}{t+k} \bigg)
\]

satisfy the inclusions $a_k \in \mathbb{Z}$ and $d_n b_k \in \mathbb{Z}$ for $k = 0, 1, \dots, n$.


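Proof. Notice that a decomposition of a rational function into the sum of
partial fractions is unique. Use the partial-fraction expansion of Lemma 7.1,

\[
R_n(t) = Q_n(t)^2 = \bigg( \sum_{k=0}^{n} \frac{c_k}{t+k} \bigg)^2
= \sum_{k=0}^{n} \frac{c_k^2}{(t+k)^2}
+ \sum_{\substack{k,l=0 \\ k \ne l}}^{n} \frac{c_k c_l}{(t+k)(t+l)}
\]
\[
= \sum_{k=0}^{n} \frac{c_k^2}{(t+k)^2}
+ \sum_{\substack{k,l=0 \\ k \ne l}}^{n} \frac{c_k c_l}{l-k}
\bigg( \frac{1}{t+k} - \frac{1}{t+l} \bigg),
\]

implying

\[
a_k = c_k^2 = \binom{n+k}{n}^2 \binom{n}{k}^2
\quad\text{and}\quad
b_k = 2 c_k \sum_{\substack{l=0 \\ l \ne k}}^{n} \frac{c_l}{l-k}
\quad\text{for } k = 0, 1, \dots, n.
\]

Since $0 < |l-k| \le n$ in the latter sum, so that $d_n/(l-k) \in \mathbb{Z}$, the
resulting formulae for $a_k$ and $b_k$ give us grounds for the required inclusions.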
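The formulae $a_k = c_k^2$ and $b_k = 2c_k\sum_{l \ne k} c_l/(l-k)$, with $c_k = (-1)^{n-k}\binom{n+k}{n}\binom{n}{k}$, lend themselves to an exact check. The sketch below (illustrative names, exact rational arithmetic) compares the product definition of $R_n(t)$ with its partial-fraction form at a sample point and tests the inclusion $d_n b_k \in \mathbb{Z}$.

```python
from fractions import Fraction
from math import comb, gcd


def lcm_upto(n):
    d = 1
    for m in range(1, n + 1):
        d = d * m // gcd(d, m)
    return d


def c(n, k):
    return (-1) ** (n - k) * comb(n + k, n) * comb(n, k)


def a(n, k):
    return c(n, k) ** 2


def b(n, k):
    inner = sum(Fraction(c(n, l), l - k) for l in range(n + 1) if l != k)
    return Fraction(2 * c(n, k) * inner)


def R_product(n, t):
    """R_n(t) from its definition as a squared product."""
    t = Fraction(t)
    value = Fraction(1)
    for j in range(1, n + 1):
        value *= (t - j) ** 2
    for j in range(n + 1):
        value /= (t + j) ** 2
    return value


def R_partial(n, t):
    """R_n(t) from the partial-fraction decomposition of Lemma 7.3."""
    t = Fraction(t)
    return sum(a(n, k) / (t + k) ** 2 + b(n, k) / (t + k) for k in range(n + 1))


n, t = 4, Fraction(7, 2)
print(R_product(n, t) == R_partial(n, t))
print([lcm_upto(n) * b(n, k) for k in range(n + 1)])
```

The sample point $t = 7/2$ is arbitrary; any rational $t$ away from the poles $0, -1, \dots, -n$ would do.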
Finally, consider the sequence

\[
r_n = -\sum_{m=1}^{\infty} \frac{dR_n}{dt}\bigg|_{t=m}. \tag{7.1}
\]

Lemma 7.4. For each $n = 0, 1, 2, \dots$, the quantity $r_n$ can be represented
in the form $r_n = q_n \zeta(3) - p_n$ with $q_n \in \mathbb{Z}$ and $d_n^3 p_n \in \mathbb{Z}$.


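Proof. We have

\[
r_n = \sum_{m=1}^{\infty} \sum_{k=0}^{n}
\bigg( \frac{2a_k}{(m+k)^3} + \frac{b_k}{(m+k)^2} \bigg)
= 2\sum_{k=0}^{n} a_k \bigg( \zeta(3) - \sum_{l=1}^{k} \frac{1}{l^3} \bigg)
+ \sum_{k=0}^{n} b_k \bigg( \zeta(2) - \sum_{l=1}^{k} \frac{1}{l^2} \bigg).
\]

Observe that

\[
\sum_{k=0}^{n} b_k = \sum_{k=0}^{n} \operatorname{Res}_{t=-k} R_n(t)
= -\operatorname{Res}_{t=\infty} R_n(t)
\]

by the residue sum theorem, and $\operatorname{Res}_{t=\infty} R_n(t) = 0$ because $R_n(t) = O(t^{-2})$
as $t \to \infty$. It follows that $r_n = q_n \zeta(3) - p_n$, where

\[
q_n = 2\sum_{k=0}^{n} a_k \in \mathbb{Z}
\quad\text{and}\quad
p_n = 2\sum_{k=0}^{n} a_k \sum_{l=1}^{k} \frac{1}{l^3}
+ \sum_{k=0}^{n} b_k \sum_{l=1}^{k} \frac{1}{l^2}.
\]

Finally, the inclusions $d_n^3 p_n \in \mathbb{Z}$ follow from the explicit formula for $p_n$ and
Lemma 7.3.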
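The construction of Lemmas 7.3 and 7.4 is completely explicit, so the linear forms can be produced exactly. The sketch below is one possible implementation (all names are ours): it builds $q_n = 2\sum_k a_k$ and $p_n = 2\sum_k a_k \sum_{l \le k} l^{-3} + \sum_k b_k \sum_{l \le k} l^{-2}$ with $a_k = c_k^2$, $b_k = 2c_k \sum_{l \ne k} c_l/(l-k)$, and checks the inclusion $d_n^3 p_n \in \mathbb{Z}$ together with the quality of the approximation $p_n/q_n \approx \zeta(3)$.

```python
from fractions import Fraction
from math import comb, gcd


def lcm_upto(n):
    d = 1
    for m in range(1, n + 1):
        d = d * m // gcd(d, m)
    return d


def linear_form(n):
    """Return (q_n, p_n) with r_n = q_n * zeta(3) - p_n."""
    c = [(-1) ** (n - k) * comb(n + k, n) * comb(n, k) for k in range(n + 1)]
    q, p = 0, Fraction(0)
    h2, h3 = Fraction(0), Fraction(0)  # partial sums of 1/l^2 and 1/l^3, l <= k
    for k in range(n + 1):
        if k >= 1:
            h2 += Fraction(1, k ** 2)
            h3 += Fraction(1, k ** 3)
        a_k = c[k] ** 2
        b_k = 2 * c[k] * sum(Fraction(c[l], l - k) for l in range(n + 1) if l != k)
        q += 2 * a_k
        p += 2 * a_k * h3 + b_k * h2
    return q, p


for n in range(5):
    q, p = linear_form(n)
    print(n, q, p, (lcm_upto(n) ** 3 * p).denominator == 1, float(p / q))
```

The printed quotients $p_n/q_n$ can be compared with the decimal value of $\zeta(3)$ quoted at the beginning of the chapter.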


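The numbers

\[
\frac{q_n}{2} = \sum_{k=0}^{n} a_k = \sum_{k=0}^{n} \binom{n+k}{n}^2 \binom{n}{k}^2,
\quad n = 0, 1, 2, \dots,
\]

showing up (after doubling) as the coefficients of $\zeta(3)$ in the linear forms $r_n$
are known as the Apéry numbers.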


Exercise 7.2 (Apéry’s recursion). (a) Verify that

\[
r_0 = 2\zeta(3) \quad\text{and}\quad r_1 = 10\zeta(3) - 12.
\]

(b) Define $S_n(t) = s_n(t) R_n(t)$, where $s_n(t) = 4(2n+1)(-2t^2 + t + (2n+1)^2)$.
Check that

\[
(n+1)^3 R_{n+1}(t) - (2n+1)(17n^2+17n+5) R_n(t) + n^3 R_{n-1}(t)
= S_n(t+1) - S_n(t)
\]

for $n = 1, 2, \dots$.
(c) Using part (b), show that the sequences $\{r_n\}_{n=0}^{\infty}$, $\{q_n\}_{n=0}^{\infty}$ and $\{p_n\}_{n=0}^{\infty}$
satisfy the same(!) recurrence relation

\[
(n+1)^3 r_{n+1} - (2n+1)(17n^2+17n+5) r_n + n^3 r_{n-1} = 0
\quad\text{for } n = 1, 2, \dots. \tag{7.2}
\]


The argument for deducing Apéry’s recursion (7.2) from the exercise is
known as creative telescoping [81,84].
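For the coefficients $q_n$ the recursion (7.2) can be tested directly on the Apéry numbers $\sum_{k=0}^{n}\binom{n+k}{n}^2\binom{n}{k}^2 = q_n/2$. The short sketch below (illustrative, exact integer arithmetic) evaluates the left-hand side of (7.2) on this sequence; it does not verify the telescoping identity of part (b).

```python
from math import comb


def apery(n):
    """Apery number A_n = sum_k C(n+k, n)^2 * C(n, k)^2, so that q_n = 2 * A_n."""
    return sum(comb(n + k, n) ** 2 * comb(n, k) ** 2 for k in range(n + 1))


def recursion_residual(n):
    """Left-hand side of (7.2) evaluated on the Apery numbers."""
    return (
        (n + 1) ** 3 * apery(n + 1)
        - (2 * n + 1) * (17 * n * n + 17 * n + 5) * apery(n)
        + n ** 3 * apery(n - 1)
    )


print([apery(n) for n in range(5)])
print(all(recursion_residual(n) == 0 for n in range(1, 40)))
```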


7.2  Apéry’s theorem 


It remains to determine the growth of the linear forms $r_n = q_n\zeta(3) - p_n$
constructed in Lemma 7.4 as $n \to \infty$. For that we will use Stirling’s formula
for the gamma function and apply the saddle-point method from analysis.


Lemma 7.5. For the sum $r_n$ in (7.1) the following integral representation
is valid:

\[
r_n = \frac{1}{2\pi i} \int_{C-i\infty}^{C+i\infty}
\Big( \frac{\pi}{\sin \pi t} \Big)^2 R_n(t) \, dt,
\]

where for the contour of integration $\operatorname{Re} t = C$ one can take any $C$ in the
interval $0 < C < n+1$.


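Proof. Fix $N > n$ and consider the rectangular contour (positively oriented)
with vertices at $C \pm iN$ and $N + \frac12 \pm iN$. The function $\pi/\sin \pi t$ is bounded
on the sides $\operatorname{Im} t = \pm N$ of the contour: for example, for $t = x - iN$, we
find that

\[
\Big| \frac{\pi}{\sin \pi t} \Big|
= \frac{2\pi}{|e^{\pi(N+ix)} - e^{-\pi(N+ix)}|}
= \frac{2\pi e^{-\pi N}}{|e^{\pi i x} - e^{-\pi(2N+ix)}|}
\le \frac{2\pi e^{-\pi N}}{1 - e^{-2\pi N}} < 4\pi
\]

and the same bound is valid for $t = x + iN$. It is also bounded on the side
$\operatorname{Re} t = N + \frac12$ of the rectangle: when $t = N + \frac12 + iy$, we get

\[
\Big| \frac{\pi}{\sin \pi t} \Big| = \frac{\pi}{\cosh \pi y}
\le 2\pi e^{-\pi |y|} \le 2\pi.
\]

The function $R_n(t)$ is $O(t^{-2})$ as $t \to \infty$, hence it is $O(N^{-2})$ on the three
sides of the contour. By performing the limit as $N \to \infty$, it follows that
the complex integral

\[
-\frac{1}{2\pi i} \int_{C-i\infty}^{C+i\infty}
\Big( \frac{\pi}{\sin \pi t} \Big)^2 R_n(t) \, dt
\]

(the minus sign accounts for the line $\operatorname{Re} t = C$ being traversed downwards on the
positively oriented contour) equals the sum of the residues of the integrand
at the poles $t = m$, where $m$ runs over the integers satisfying $m > C$. Since

\[
\Big( \frac{\pi}{\sin \pi t} \Big)^2 = \frac{1}{(t-m)^2} + O(1)
\]

and

\[
R_n(t) = R_n(m) + R_n'(m)(t-m) + O((t-m)^2)
\]

as $t \to m$, we conclude that

\[
\operatorname{Res}_{t=m} \Big( \frac{\pi}{\sin \pi t} \Big)^2 R_n(t) = R_n'(m).
\]

It remains to notice that $R_n'(m) = 0$ for $m = 1, 2, \dots, n$, so that

\[
\sum_{m > C} \operatorname{Res}_{t=m} \Big( \frac{\pi}{\sin \pi t} \Big)^2 R_n(t)
= \sum_{m > C} R_n'(m) = \sum_{m \ge 1} R_n'(m) = -r_n.
\]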

Using the properties of Euler’s gamma function $\Gamma(t)$ (see Section 3.1)
we obtain the expression

\[
F(t) = \Big( \frac{\pi}{\sin \pi t} \Big)^2 R_n(t)
= \frac{\Gamma(n+1-t)^2 \, \Gamma(t)^4}{\Gamma(n+1+t)^2} \tag{7.3}
\]

for the integrand in Lemma 7.5.


Lemma 7.6 (Stirling’s formula). In the half-plane $\operatorname{Re} t > 0$,

\[
\ln \Gamma(t) = \Big( t - \frac12 \Big) \ln t - t + \ln \sqrt{2\pi} + \rho(t),
\]

where the error term $\rho(t)$ satisfies $|\rho(t)| \le c (\operatorname{Re} t)^{-1}$ for some absolute
constant $c > 0$.


As proving this formula is beyond our scope here, we only highlight
some ideas behind the proof.

Sketch. For the logarithmic derivative of the gamma function, one can show
that

\[
\frac{\Gamma'(t)}{\Gamma(t)} = \frac{d}{dt} \ln \Gamma(t)
= \int_0^{\infty} \bigg( \frac{e^{-u}}{u} - \frac{e^{-tu}}{1 - e^{-u}} \bigg) du; \tag{7.4}
\]

this formula is due to Binet. Integrating this equality intelligently, one gets

\[
\ln \Gamma(t) = \Big( t - \frac12 \Big) \ln t - t + \ln \sqrt{2\pi}
+ \int_0^{\infty} \bigg( \frac12 - \frac1u + \frac{1}{e^u - 1} \bigg) \frac{e^{-tu}}{u} \, du. \tag{7.5}
\]

By choosing $c$ to be the supremum of

\[
\frac1u \, \bigg| \frac12 - \frac1u + \frac{1}{e^u - 1} \bigg|
\]

on the real half-line $u > 0$ (and one can check that the expression is bounded
there), we finally find that

\[
|\rho(t)| \le c \int_0^{\infty} |e^{-tu}| \, du
= c \int_0^{\infty} e^{-u \operatorname{Re} t} \, du = \frac{c}{\operatorname{Re} t}.
\]


Exercise 7.3. (a) Deduce formula (7.5) from Binet’s (7.4).
(b) Complete the proof of Lemma 7.6.
(c) Prove Stirling’s asymptotic formula for the factorial function

\[
n! \sim \sqrt{2\pi n} \, \Big( \frac{n}{e} \Big)^n \quad\text{as } n \to \infty,
\]

and its corollary

\[
\binom{2n}{n} \sim \frac{2^{2n}}{\sqrt{\pi n}} \quad\text{as } n \to \infty
\]

for the central binomial coefficients.


Lemma 7.7. As $n \to \infty$, the following asymptotics is valid:
$r_n^{1/n} \to (\sqrt2 - 1)^4$.


Proof. In the integral representation of Lemma 7.5 take $C = (n+1)/\sqrt2$
and change the variable $t = (n+1)z$. The real parts of $n+1+t$, $n+1-t$
and $t$ are bounded from below by $c_1 n$ on the contour of integration for some
$c_1 > 0$, hence application of Lemma 7.6 to the integrand (7.3) results in

\[
\ln F(t) = (2n+1-2t)\ln(n+1-t) - 2(n+1-t) + (4t-2)\ln t - 4t
\]
\[
\qquad - (2n+1+2t)\ln(n+1+t) + 2(n+1+t) + 2\ln(2\pi) + O(n^{-1})
\]
\[
= 2(n+1) f(z) + \ln h(z) - 2\ln(n+1) + 2\ln(2\pi) + O(n^{-1}),
\]

where

\[
f(z) = (1-z)\ln(1-z) + 2z\ln z - (1+z)\ln(1+z), \qquad
h(z) = \frac{1+z}{z^2(1-z)},
\]

and the constant in $O(n^{-1})$ is absolute. This implies that

\[
r_n = \frac{2\pi}{i(n+1)} \int_{z_0 - i\infty}^{z_0 + i\infty}
e^{2(n+1) f(z)} \, \frac{1+z}{z^2(1-z)} \, (1 + O(n^{-1})) \, dz
\]

for $z_0 = 1/\sqrt2$ and some absolute constant in $O(n^{-1})$.

Consider the function $g(y) = \operatorname{Re} f(z_0 + iy)$, that is, the real part of $f(z)$
on the contour of integration. We have

\[
\frac{dg}{dy} = -\operatorname{Im} \frac{df}{dz}\bigg|_{z = z_0 + iy}
= \operatorname{Im} \ln(z^{-2} - 1)\Big|_{z = z_0 + iy},
\]

hence $dg/dy$ vanishes at $y = 0$ only. In a neighbourhood of the point we
get $g(y) = g(0) - 2^{3/2} y^2 + O(y^3)$, so that $g(y)$ has its maximum at $y = 0$.
Then

\[
f(z) = f(z_0) + 2^{3/2}(z - z_0)^2 + O((z - z_0)^3)
= g(0) + 2^{3/2}(z - z_0)^2 + O((z - z_0)^3)
\]

on the contour of integration; the maximum of $|e^{f(z)}|$ is attained at $z = z_0$
and it is equal to $e^{f(z_0)}$. Thus, we obtain

\[
\lim_{n\to\infty} r_n^{1/(n+1)} = e^{2 f(z_0)} = (\sqrt2 - 1)^4,
\]

and the result follows.

Theorem 7.2 (Apéry’s theorem). The number $\zeta(3)$ is irrational.


Proof. Assume on the contrary that $\zeta(3)$ is rational, $\zeta(3) = a/b$ for some
$a, b \in \mathbb{Z}_{>0}$. Since $r_n^{1/n}$ tends to a positive quantity as $n \to \infty$, we conclude
that $r_n$ does not vanish for all $n$ sufficiently large. In particular, the
integers $b d_n^3 r_n = a d_n^3 q_n - b d_n^3 p_n$ are nonzero for all such indices $n$; this
implies that $|b d_n^3 r_n| \ge 1$ for all sufficiently large $n$. On the other hand,
by Lemmas 7.2 and 7.7,
$|b d_n^3 r_n|^{1/n} \to e^3 (\sqrt2 - 1)^4 = 0.59\ldots < 1$, so that $|b d_n^3 r_n| < 1$ for all $n$ large.
The contradiction means that our assumption $\zeta(3) \in \mathbb{Q}$ is invalid.
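To see the mechanism of the proof at work, one can tabulate the quantities $d_n^3 r_n$ numerically. The sketch below makes two assumptions that should be stated plainly: it generates $q_n$ and $p_n$ from the recursion (7.2) with the initial values $q_0 = 2$, $q_1 = 10$, $p_0 = 0$, $p_1 = 12$ read off from Exercise 7.2, and it replaces $\zeta(3)$ by the rational stand-in $p_N/q_N$ with $N = 60$, whose error is far below anything visible for $n \le 25$. All names are illustrative.

```python
from fractions import Fraction
from math import gcd


def lcm_upto(n):
    d = 1
    for m in range(1, n + 1):
        d = d * m // gcd(d, m)
    return d


def apery_pq(n_max):
    """Sequences q_0..q_{n_max} and p_0..p_{n_max} generated by recursion (7.2)."""
    q = [Fraction(2), Fraction(10)]
    p = [Fraction(0), Fraction(12)]
    for n in range(1, n_max):
        poly = (2 * n + 1) * (17 * n * n + 17 * n + 5)
        q.append((poly * q[n] - n ** 3 * q[n - 1]) / (n + 1) ** 3)
        p.append((poly * p[n] - n ** 3 * p[n - 1]) / (n + 1) ** 3)
    return q, p


def scaled_forms(n_max=25, n_ref=60):
    """Approximate values of d_n^3 * r_n for n = 0..n_max."""
    q, p = apery_pq(n_ref)
    zeta3 = p[n_ref] / q[n_ref]  # rational stand-in for zeta(3)
    return [float(lcm_upto(n) ** 3 * (q[n] * zeta3 - p[n])) for n in range(n_max + 1)]


for n, value in enumerate(scaled_forms()):
    print(n, value)
```

The values are positive and fall well below 1, which is exactly what is incompatible with $b d_n^3 r_n$ being a nonzero integer for a fixed $b$.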


7.3 Arithmetic properties of special rational functions

In this part, which is spread over Sections 7.3–7.6, we generalise the
construction from Sections 7.1–7.2 to prove the following result.


Theorem 7.3. At least one of the eleven numbers $\zeta(5), \zeta(7), \dots, \zeta(25)$ is
irrational.

We fix an odd integer $s \ge 7$. Our strategy is constructing two sequences
of linear forms $r_n$ and $\hat r_n$ living in the $\mathbb{Q}$-space
$\mathbb{Q} + \mathbb{Q}\zeta(3) + \mathbb{Q}\zeta(5) + \dots + \mathbb{Q}\zeta(s)$,
for which we have a control of the common denominators $\lambda_n$ of
rational coefficients and an elementary access to their asymptotic behaviour
as $n \to \infty$; more importantly, the two coefficients of $\zeta(3)$ in these forms
are proportional (with factor 7), so that $7r_n - \hat r_n$ belongs to the space
$\mathbb{Q} + \mathbb{Q}\zeta(5) + \dots + \mathbb{Q}\zeta(s)$. Finally, using $7r_n - \hat r_n > 0$ and the asymptotics
$\lambda_n(7r_n - \hat r_n) \to 0$ as $n \to \infty$ of the linear forms

\[
\lambda_n(7r_n - \hat r_n) \in \mathbb{Z} + \mathbb{Z}\zeta(5) + \mathbb{Z}\zeta(7) + \dots + \mathbb{Z}\zeta(s)
\]

when $s = 25$, we conclude that it cannot happen that all the quantities
$\zeta(5), \zeta(7), \dots, \zeta(25)$ are rational.

More precisely, our linear forms assume the form

\[
r_n = \sum_{\nu=1}^{\infty} R_n(\nu)
\quad\text{and}\quad
\hat r_n = \sum_{\nu=1}^{\infty} R_n\Big(\nu - \frac12\Big), \tag{7.6}
\]

where the rational-function summand $R_n(t)$ is defined as follows:

\[
R_n(t) = \frac{n!^{s-5} \prod_{j=1}^{n}(t-j) \cdot \prod_{j=1}^{n}(t+n+j)
\cdot 2^{6n} \prod_{j=1}^{3n}(t-n-\frac12+j)}{\prod_{j=0}^{n}(t+j)^{s}}
= \frac{2^{6n} \, n!^{s-5} \prod_{j=0}^{6n}(t-n+\frac{j}{2})}{\prod_{j=0}^{n}(t+j)^{s+1}}. \tag{7.7}
\]
We first discuss a general rational function $S(t)$ of the form

\[
S(t) = \frac{P(t)}{(t-t_1)^{s_1}(t-t_2)^{s_2} \cdots (t-t_q)^{s_q}},
\]

whose denominator has degree larger than its numerator, so that its unique
partial-fraction decomposition assumes the form

\[
S(t) = \sum_{j=1}^{q} \sum_{i=1}^{s_j} \frac{b_{i,j}}{(t-t_j)^i}.
\]

The coefficients here can be computed on the basis of the explicit formula

\[
b_{i,j} = \frac{1}{(s_j-i)!} \Big( S(t)(t-t_j)^{s_j} \Big)^{(s_j-i)} \Big|_{t=t_j}
\]

for all $i, j$ in question. (This procedure is seen in action in the examples
discussed in Lemma 7.1 and Exercise 7.1, when all the exponents $s_j$ are
equal to 1.) It also means that the function $R(t) = R_n(t)$ in (7.7) can be written as

\[
R(t) = \sum_{i=1}^{s} \sum_{k=0}^{n} \frac{a_{i,k}}{(t+k)^i} \tag{7.8}
\]

with the recipe to compute the coefficients $a_{i,k}$ in its partial-fraction
decomposition. At the same time, the function $R(t)$ is a product of ‘simpler’
rational functions given in Lemma 7.1 and Exercise 7.1, with all coefficients
of their partial fractions being integral.


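Lemma 7.8. Let $k_1, \dots, k_q$ be pairwise distinct numbers from the set
$\{0, 1, \dots, n\}$ and $s_1, \dots, s_q$ positive integers. Then the coefficients in the
expansion

\[
\frac{1}{\prod_{j=1}^{q}(t+k_j)^{s_j}} = \sum_{j=1}^{q} \sum_{i=1}^{s_j} \frac{b_{i,j}}{(t+k_j)^i}
\]

satisfy

\[
d_n^{s-i} b_{i,j} \in \mathbb{Z}, \quad\text{where } i = 1, \dots, s_j \text{ and } j = 1, \dots, q, \tag{7.9}
\]

where $s = s_1 + \dots + s_q$.
In particular,

\[
d_n^{s-i} a_{i,k} \in \mathbb{Z}, \quad\text{where } i = 1, \dots, s \text{ and } k = 0, 1, \dots, n, \tag{7.10}
\]

for the coefficients in (7.8).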


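Proof. Denote the rational function in question by $S(t)$. The statement is
trivially true when $q = 1$, therefore we assume that $q \ge 2$. In view of the
symmetry of the data, it is sufficient to demonstrate the inclusions (7.9) for
$j = 1$. Differentiating a related product $m$ times, for any $m \ge 0$, we obtain

\[
\frac{1}{m!} \bigg( \prod_{j=2}^{q} (t+k_j)^{-s_j} \bigg)^{(m)}
= \sum_{\substack{\ell_2, \dots, \ell_q \ge 0 \\ \ell_2 + \dots + \ell_q = m}}
\prod_{j=2}^{q} \frac{1}{\ell_j!} \Big( (t+k_j)^{-s_j} \Big)^{(\ell_j)}
= (-1)^m \sum_{\substack{\ell_2, \dots, \ell_q \ge 0 \\ \ell_2 + \dots + \ell_q = m}}
\prod_{j=2}^{q} \binom{s_j+\ell_j-1}{\ell_j} \frac{1}{(t+k_j)^{s_j+\ell_j}}.
\]

This implies that

\[
b_{i,1} = (-1)^{s_1-i} \sum_{\substack{\ell_2, \dots, \ell_q \ge 0 \\ \ell_2 + \dots + \ell_q = s_1-i}}
\prod_{j=2}^{q} \binom{s_j+\ell_j-1}{\ell_j} \frac{1}{(k_j-k_1)^{s_j+\ell_j}}
\]

for $i = 1, \dots, s_1$. Using $d_n/(k_j - k_1) \in \mathbb{Z}$ for $j = 2, \dots, q$ and
$\sum_{j=2}^{q}(s_j + \ell_j) = s - i$ for each individual summand, we deduce the desired inclusion
in (7.9) for $j = 1$, hence for any $j$.

The second claim in the lemma follows from considering $R(t)$ as a product
of the ‘simpler’ rational functions from Lemma 7.1 and Exercise 7.1.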


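Lemma 7.9. For the coefficients $a_{i,k}$ in (7.8), we have

\[
a_{i,k} = (-1)^{i-1} a_{i,n-k} \quad\text{for } k = 0, 1, \dots, n \text{ and } i = 1, \dots, s,
\]

so that

\[
\sum_{k=0}^{n} a_{i,k} = 0 \quad\text{for } i \text{ even}.
\]


Proof. Since $s$ is odd, the function (7.7) possesses the following
(well-poised) symmetry: $R(-t-n) = -R(t)$. Substitution of the relation into
(7.8) results in

\[
\sum_{i=1}^{s} \sum_{k=0}^{n} \frac{a_{i,k}}{(t+k)^i}
= -\sum_{i=1}^{s} \sum_{k=0}^{n} \frac{a_{i,k}}{(-t-n+k)^i}
= \sum_{i=1}^{s} \sum_{k=0}^{n} \frac{(-1)^{i-1} a_{i,k}}{(t+n-k)^i}
= \sum_{i=1}^{s} \sum_{k=0}^{n} \frac{(-1)^{i-1} a_{i,n-k}}{(t+k)^i},
\]

and the identities in the lemma follow from the uniqueness of decomposition
into partial fractions. The second statement follows from

\[
\sum_{k=0}^{n} a_{i,k} = (-1)^{i-1} \sum_{k=0}^{n} a_{i,n-k} = (-1)^{i-1} \sum_{k=0}^{n} a_{i,k}.
\]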

7.4 Arithmetic properties of linear forms in zeta values 


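We now take a closer look at the quantities defined in (7.6).

Lemma 7.10. For each $n$,

\[
r_n = \sum_{\substack{i=3 \\ i \text{ odd}}}^{s} a_i \zeta(i) + a_0
\quad\text{and}\quad
\hat r_n = \sum_{\substack{i=3 \\ i \text{ odd}}}^{s} a_i (2^i - 1) \zeta(i) + \hat a_0,
\]

with the following inclusions available:

\[
d_n^{s-i} a_i \in \mathbb{Z} \ \text{for } i = 3, 5, \dots, s, \quad\text{and}\quad d_n^s a_0, \ d_n^s \hat a_0 \in \mathbb{Z}.
\]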

Notice that 


for 7 > 2. 
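Proof. Our strategy here is to write the series in (7.6) using the
partial-fraction decomposition (7.8) of $R(t)$. To treat the first sum $r_n$, we
additionally introduce an auxiliary parameter $z > 0$, which we later specialise to
$z = 1$:

\[
\sum_{\nu=1}^{\infty} R_n(\nu) z^{\nu}
= \sum_{\nu=1}^{\infty} \sum_{i=1}^{s} \sum_{k=0}^{n} \frac{a_{i,k} z^{\nu}}{(\nu+k)^i}
= \sum_{i=1}^{s} \sum_{k=0}^{n} a_{i,k} z^{-k} \sum_{\nu=1}^{\infty} \frac{z^{\nu+k}}{(\nu+k)^i}
\]
\[
= \sum_{i=1}^{s} \sum_{k=0}^{n} a_{i,k} z^{-k}
\bigg( \operatorname{Li}_i(z) - \sum_{\ell=1}^{k} \frac{z^{\ell}}{\ell^i} \bigg)
= \sum_{i=1}^{s} \operatorname{Li}_i(z) \sum_{k=0}^{n} a_{i,k} z^{-k}
- \sum_{i=1}^{s} \sum_{k=0}^{n} \sum_{\ell=1}^{k} \frac{a_{i,k} z^{-(k-\ell)}}{\ell^i},
\]

where

\[
\operatorname{Li}_i(z) = \sum_{\ell=1}^{\infty} \frac{z^{\ell}}{\ell^i}
\]

for $i = 1, \dots, s$ are the polylogarithmic functions. The latter are well
defined at $z = 1$ for $i \ge 2$, where $\operatorname{Li}_i(1) = \zeta(i)$, while $\operatorname{Li}_1(z) = -\log(1-z)$
does not have a limit as $z \to 1^-$. By taking the limit as $z \to 1^-$ in the
above derivation and using $R_n(\nu) = O(\nu^{-2})$ as $\nu \to \infty$, we conclude that

\[
\sum_{k=0}^{n} a_{1,k} = \lim_{z \to 1^-} \sum_{k=0}^{n} a_{1,k} z^{-k} = 0,
\]

and

\[
r_n = \sum_{i=2}^{s} \zeta(i) \sum_{k=0}^{n} a_{i,k}
- \sum_{i=1}^{s} \sum_{k=0}^{n} a_{i,k} \sum_{\ell=1}^{k} \frac{1}{\ell^i}. \tag{7.11}
\]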

We proceed similarly for 7,,, omitting introduction of the auxiliary pa- 


rameter z. Since R(t) in (7.7) vanishes at t = —4,—3,...,—n = +, we 
can shift the starting point of summation for 7, to t = —m — 4, where 


m = [45+], so that 


— Co R, Aaa loc) s ik 

3 2s wa) 2 LD wrk 
1 

a (v+k— 4) 


1 
“Loe Ye peep a ee 


i=1 k=0 v=—m 
i 1
i=1 k=0 =k—m 2 l=1 2 
Ss n lee) 1 k—-m-1 1 
+ Aik ( i I :) 
i=1 2s ne (¢— 3) d (¢- 5) 
s n s om m—k : 
a ee 
a Pa : j : 
2 ae ; are a e+ 5) 


(7.12) 


| 
M4: 
= 
= 
_~ 
SS 
| fe 
ers 


Now the statement of the lemma follows from the representations in (7.11) 
and (7.12), Lemma 7.9, the inclusions (7.10) of Lemma 7.8 and 


k 
‘ 1 
di, 5 xe for O<k<n and i>1, 


l=1 
m—-k 
-1): 
dy, ( y ¢Z for O0<k<m and i> 1, 
Fs) 
k—-m-1 
3 1 
di-41 ny EZ for m+1<k<n andist. 
waa_—«((E a 


7.5 Asymptotic behaviour 


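In this section we make frequent use of Stirling’s asymptotic formula for
the factorial function from Exercise 7.3(c).

Because the rational function $R_n(t)$ in (7.7) vanishes at $1, 2, \dots, n$ and
at $\frac12, \frac32, \dots, n - \frac12$, the sums (7.6) can be alternatively written as

\[
r_n = \sum_{\nu=n+1}^{\infty} R_n(\nu) = \sum_{k=0}^{\infty} c_k
\quad\text{and}\quad
\hat r_n = \sum_{\nu=n+1}^{\infty} R_n\Big(\nu - \frac12\Big) = \sum_{k=0}^{\infty} \hat c_k,
\]

with the involved summands

\[
c_k = R_n(n+1+k)
= \frac{2^{6n} \, n!^{s-5} \prod_{j=0}^{6n}(k+1+\frac{j}{2})}{\prod_{j=0}^{n}(n+k+1+j)^{s+1}}
= \frac{n!^{s-5} \, (6n+2k+2)! \, (n+k)!^{s+1}}{2 \, (2k+1)! \, (2n+k+1)!^{s+1}} \tag{7.13}
\]

and

\[
\hat c_k = R_n\Big(n+\frac12+k\Big)
= \frac{2^{6n} \, n!^{s-5} \prod_{j=0}^{6n}(k+\frac12+\frac{j}{2})}{\prod_{j=0}^{n}(n+k+\frac12+j)^{s+1}}
\]

strictly positive. Observe that

\[
\frac{c_k}{\hat c_k}
= \frac{\prod_{j=0}^{6n}(2k+2+j)}{\prod_{j=0}^{6n}(2k+1+j)}
\cdot \bigg( \prod_{j=0}^{n} \frac{n+k+\frac12+j}{n+k+1+j} \bigg)^{s+1}
= \frac{6n+2k+2}{2k+1}
\Bigg( \frac{2^{-2(n+1)} \binom{4n+2k+2}{2n+k+1}}{\binom{2n+2k}{n+k}} \Bigg)^{s+1}
\]
\[
\sim \frac{6n+2k+2}{2k+1} \Big( \frac{n+k}{2n+k+1} \Big)^{(s+1)/2}
\quad\text{as } n+k \to \infty. \tag{7.14}
\]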
Lemma 7.11. For $s \ge 7$ odd,

\[
\lim_{n\to\infty} r_n^{1/n} = \lim_{n\to\infty} \hat r_n^{1/n} = g(x_0)
\quad\text{and}\quad
\lim_{n\to\infty} \frac{\hat r_n}{r_n} = 1,
\]

where

\[
g(x) = \frac{2^6 (x+3)^6 (x+1)^{s+1}}{(x+2)^{2(s+1)}}
\]

and $x_0$ is the unique positive zero of the polynomial

\[
x(x+2)^{(s+1)/2} - (x+3)(x+1)^{(s+1)/2}.
\]

Proof. We have

\[
\frac{c_{k+1}}{c_k}
= \frac{(k+3n+\frac32)(k+3n+2)}{(k+1)(k+\frac32)}
\Big( \frac{k+n+1}{k+2n+2} \Big)^{s+1}
\sim f\Big( \frac{k}{n} \Big)^2 \tag{7.15}
\]

as $n + k \to \infty$, where

\[
f(x) = \frac{x+3}{x} \Big( \frac{x+1}{x+2} \Big)^{(s+1)/2}.
\]

For ease of notation write $q = (s+1)/2 \ge 4$. Since

\[
\frac{f'(x)}{f(x)} = \frac{1}{x+3} - \frac{1}{x} + q \Big( \frac{1}{x+1} - \frac{1}{x+2} \Big)
= \frac{(q-3)x^2 + 3(q-3)x - 6}{x(x+1)(x+2)(x+3)}
\]

and the quadratic polynomial in the latter numerator has a unique positive
zero $x_1$, the function $f(x)$ monotonically decreases from $+\infty$ to $f(x_1)$ when $x$
ranges from 0 to $x_1$ and then monotonically increases from $f(x_1)$ to $f(+\infty) = 1$
(not attaining the value!) when $x$ ranges from $x_1$ to $+\infty$. In particular,
there is exactly one positive solution $x_0$ to $f(x) = 1$. Notice that
$0 < x_0 < 1$, because $f(1) = 4 \cdot (2/3)^q < 1$.

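The information gained and asymptotics in (7.15) imply that $c_{k+1}/c_k > 1$
for the indices $k < x_0 n - \gamma\sqrt{n}$ and $c_{k+1}/c_k < 1$ for $k > x_0 n + \gamma\sqrt{n}$ for
an appropriate choice of $\gamma > 0$ dictated by application of Stirling’s formula
to the factorials defining $c_k$ in (7.13). This means that the asymptotic
behaviour of the sum $r_n = \sum_{k=0}^{\infty} c_k$ is determined by the asymptotics of
$c_{k_0}$ and its neighbours $c_k$, where $k_0 = k_0(n) \sim x_0 n$ and $|k - k_0| \le \gamma\sqrt{n}$, so
that

\[
\lim_{n\to\infty} r_n^{1/n} = \lim_{n\to\infty} c_{k_0}^{1/n}
= \lim_{n\to\infty} \bigg( \Big( \frac{n}{e} \Big)^{(s-5)n}
\Big( \frac{6n+2k_0+2}{e} \Big)^{6n+2k_0+2}
\Big( \frac{e}{2k_0+1} \Big)^{2k_0+1}
\]
\[
\qquad \times \Big( \frac{n+k_0}{e} \Big)^{(s+1)(n+k_0)}
\Big( \frac{e}{2n+k_0+1} \Big)^{(s+1)(2n+k_0+1)} \bigg)^{1/n}
\]
\[
= \frac{(2x_0+6)^{2x_0+6} (x_0+1)^{(s+1)(x_0+1)}}{(2x_0)^{2x_0} (x_0+2)^{(s+1)(x_0+2)}}
= \frac{2^6 (x_0+3)^6 (x_0+1)^{s+1}}{(x_0+2)^{2(s+1)}} \cdot f(x_0)^{2x_0} = g(x_0).
\]

It now follows from (7.14) that

\[
\frac{\hat c_{k+1}}{\hat c_k} \sim \frac{c_{k+1}}{c_k}
\quad\text{as } n + k \to \infty, \tag{7.16}
\]

so that the above analysis applies to the sum $\hat r_n = \sum_{k=0}^{\infty} \hat c_k$ as well, and
its asymptotic behaviour is determined by the asymptotics of $\hat c_{\hat k_0}$ and its
neighbours $\hat c_k$, where $\hat k_0 = \hat k_0(n) \sim x_0 n$ and $|k - \hat k_0| \le \hat\gamma\sqrt{n}$. From (7.16)
we deduce that the limits of $\hat c_{\hat k_0}^{1/n}$ and $c_{k_0}^{1/n}$ as $n \to \infty$ coincide, hence
$\hat r_n^{1/n} \to g(x_0)$ as $n \to \infty$. In addition to this, we also get

\[
\lim_{n\to\infty} \frac{r_n}{\hat r_n}
= \lim_{n\to\infty} \frac{c_{k_0(n)}}{\hat c_{k_0(n)}}
= \lim_{n\to\infty} \frac{6n+2k_0+2}{2k_0+1} \Big( \frac{n+k_0}{2n+k_0+1} \Big)^{(s+1)/2}
= f(x_0) = 1,
\]

which leads to the remaining limiting relation.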


7.6 One of the numbers $\zeta(5), \zeta(7), \dots, \zeta(25)$ is irrational


We now combine the information gathered about the linear forms $r_n$ and
$\hat r_n$ to conclude our proof of Theorem 7.3.

We choose $s = 25$ and apply Lemma 7.11 to find out that $7r_n - \hat r_n > 0$
for $n$ sufficiently large, and

\[
\lim_{n\to\infty} (7r_n - \hat r_n)^{1/n} = g(x_0) = \exp(-25.292363\ldots),
\]

where $x_0 = 0.00036713\ldots$ is the positive zero of $x(x+2)^{13} - (x+3)(x+1)^{13}$.
Assuming that the odd zeta values from $\zeta(5)$ to $\zeta(25)$ are all rational and
denoting by $b$ their common denominator, we use Lemmas 7.2 and 7.10 to
conclude that the sequence of positive integers

\[
b \, d_n^{25} (7r_n - \hat r_n)
\]

tends to 0 as $n \to \infty$; contradiction. Thus, at least one of the numbers
$\zeta(5), \zeta(7), \dots, \zeta(25)$ is irrational.
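The two numerical constants of this section can be reproduced with a few lines of code. The sketch below assumes the readings $f(x) = \frac{x+3}{x}\bigl(\frac{x+1}{x+2}\bigr)^{(s+1)/2}$ and $g(x) = 2^6(x+3)^6(x+1)^{s+1}/(x+2)^{2(s+1)}$ from Lemma 7.11 and its proof; it locates $x_0$ by bisection on $f(x) = 1$ (the names and the bisection bracket are our own choices) and then evaluates $\ln g(x_0)$ and the exponent $25 + \ln g(x_0)$ that governs $d_n^{25}(7r_n - \hat r_n)$.

```python
from math import log


def f(x, s):
    return (x + 3) / x * ((x + 1) / (x + 2)) ** ((s + 1) // 2)


def g(x, s):
    return 2 ** 6 * (x + 3) ** 6 * (x + 1) ** (s + 1) / (x + 2) ** (2 * (s + 1))


def find_x0(s, lo=1e-9, hi=1.0, iterations=200):
    """Bisection for the positive root of f(x) = 1; f > 1 to its left, f < 1 to its right."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid, s) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


s = 25
x0 = find_x0(s)
print(x0, log(g(x0, s)), s + log(g(x0, s)))
```

A negative value of $25 + \ln g(x_0)$ is what makes the sequence $b\,d_n^{25}(7r_n - \hat r_n)$ tend to zero.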


Chapter notes 


Numerous proofs of Apéry’s theorem (Theorem 7.2) are now recorded; in
our exposition we follow closely the version given in [58]. The story behind
the original proof of Apéry [3] (together with the completed proof!) is
beautifully presented in [64]. A proof given shortly after by F. Beukers [12]
uses real-valued triple integrals; it is still considered the most elegant and
is a source of further research [15, 16, 66] in the irrationality direction. A 2004
historical account of the developments around Apéry’s and Rivoal’s theorems
can be found in [30].

The proof of Lemma 7.11 (elementary asymptotics of linear forms) is
inspired by the methodology and examples from de Bruijn’s book [17],
which is a definite recommendation for learning techniques of asymptotic
analysis.

The trick, used in Theorem 7.3, of eliminating an ‘unwanted’ term of
$\zeta(3)$ in linear forms in odd zeta values finds further applications. One of the
most recent results in this direction is the theorem of L. Lai and P. Yu [51] that
at least. 74 \/s/Ins numbers on the list ¢(3),¢(5),...,¢(s) are irrational, 
where s > 10+ is odd; this in turn builds on the earlier work [31]. 


Chapter 8 


Hilbert’s seventh problem 


The following problem posed by Hilbert in 1900 was resolved in the 1930s
independently by A. Gelfond and Th. Schneider.


Theorem 8.1 (Hilbert’s seventh problem, Gelfond–Schneider theorem).
Let $\alpha$ and $\beta$ be algebraic, $\alpha \ne 0, 1$ and $\beta$ irrational. Then $\alpha^{\beta}$ is
transcendental.


In this chapter we present two different proofs of Theorem 8.1. Our first
proof uses the so-called interpolation determinants of M. Laurent, while
the second one (only sketched here) is the original proof of Schneider. In
both proofs, the constructions depend on a sufficiently large natural
parameter $N$.


Exercise 8.1. Show that the statement of Theorem 8.1 is equivalent to the
following: If $\alpha_1, \alpha_2$ are nonzero algebraic numbers such that the quotient

\[
\gamma = \frac{\ln \alpha_1}{\ln \alpha_2}
\]

is irrational, then $\gamma$ is transcendental.


This is the form in which Theorem 8.1 was proven by Gelfond.


8.1 Reduction of proof 


Inspired by Exercise 8.1, from now on we use the notation $\alpha_1 = \alpha$ and
$\alpha_2 = \alpha^{\beta}$. Define the (non-square!) matrix

\[
M = \big\| a_{r,s}^{u,v} \big\|, \qquad a_{r,s}^{u,v} = (r + s\beta)^u (\alpha_1^r \alpha_2^s)^v,
\]

where $0 \le r, s < 2N$ and $0 \le u < K = \lfloor N \ln N \rfloor$, $0 \le v < L = \lfloor N/\ln N \rfloor$,
whose columns are indexed by pairs $u, v$ (the total number of which is
$KL \sim N^2$), while the rows are indexed by pairs $r, s$ (and there are exactly
$4N^2$ such pairs).


Lemma 8.1. The rank of the matrix $M$ is equal to $KL$ (that is, maximal
possible).


The lemma implies the existence of a nonzero minor of $M$ of maximal
order. Choose and fix one of these nonzero minors, say $\Delta$, and denote by $\mathcal{L}$
the corresponding collection of rows $r, s$:

\[
\Delta = \det \big\| a_{r,s}^{u,v} \big\|_{(r,s) \in \mathcal{L}}^{0 \le u < K, \ 0 \le v < L} \ne 0.
\]
Lemma 8.2. Eventually, the estimate In|A| < —N* holds. 


Lemma 8.3. If a, 8,«° are algebraic numbers, then In|A| > —iN* for all 
sufficiently large N. 


Proof of Theorem 8.1. Assuming the three numbers $\alpha, \beta, \alpha^{\beta}$ are algebraic,
we find the estimates in Lemmas 8.2 and 8.3 contradictory. This shows
the truth of Theorem 8.1.


In order to move further, we will introduce some more notation. For
an algebraic number $\alpha$, denote by $\mathbb{Q}(\alpha)$ the algebraic extension of the field
of rationals that consists of all polynomials (and rational functions!) in $\alpha$
with rational coefficients. The notation $[\mathbb{Q}(\alpha) : \mathbb{Q}]$ is then used to denote
the degree of algebraic $\alpha$. If an algebraic field $\mathbb{K}$ (that is, a field all of whose
elements are algebraic numbers) can be generated by finitely many algebraic
numbers $\alpha_1, \dots, \alpha_m$, then there is also a single generator $\alpha$ of it called a
primitive element, $\mathbb{K} = \mathbb{Q}(\alpha)$; then $[\mathbb{K} : \mathbb{Q}] = [\mathbb{Q}(\alpha) : \mathbb{Q}]$. In a similar way,
the intermediate degrees $[\mathbb{K} : \mathbb{Q}(\alpha)]$ are introduced, when $\mathbb{K}$ is an algebraic
extension of $\mathbb{Q}(\alpha)$.

If $P(x_1, \dots, x_m)$ is a polynomial, then the maximum of the absolute
values of its coefficients is called the height and denoted $H(P)$, while the
sum of the absolute values of its coefficients is called the length and denoted
$L(P)$. The height and length of an algebraic $\alpha$ correspond to the related
characteristics of the minimal primitive polynomial for $\alpha$.

Theorem 8.2 (Liouville-type theorem; compare with Exercise 6.7). Let 
Q1,---,Qm be algebraic, K = Q(ay,...,Am), and let P(x1,...,2%m) be a 
polynomial with integral coefficients. Then either P(a,,...,Q@m) = 0 or 

m 


Peay se) 2 El] La) eee ae, 
i=1 
Proof of Lemma 8.3. Consider the polynomial

\[
P(x_1, x_2, x_3) = \det \big\| (r + s x_3)^u (x_1^r x_2^s)^v \big\|_{(r,s) \in \mathcal{L}}^{0 \le u < K, \ 0 \le v < L}
\in \mathbb{Z}[x_1, x_2, x_3].
\]

Note that $P \not\equiv 0$ since $P(\alpha_1, \alpha_2, \beta) = \Delta \ne 0$ (by Lemma 8.1 and our choice
of $\Delta$).

To apply Theorem 8.2, notice that in our case the numbers $\alpha_1$, $\alpha_2$,
$\alpha_3 = \beta$ are fixed, so that the degrees of algebraic extensions $[\mathbb{K} : \mathbb{Q}]$ and
$[\mathbb{K} : \mathbb{Q}(\alpha_i)]$ are fixed as well. Furthermore,

\[
\deg_{x_1} P \le (2N \cdot L) \cdot KL \le \frac{2N^4}{\ln N}, \qquad
\deg_{x_2} P \le (2N \cdot L) \cdot KL \le \frac{2N^4}{\ln N}, \qquad
\deg_{x_3} P \le K \cdot KL \le N^3 \ln N
\]

and

\[
L(P) \le (KL)! \cdot (4N)^{K \cdot KL}
\le e^{N^2 \ln(N^2)} \, e^{N^3 \ln N \cdot \ln(4N)} \le e^{2N^3 \ln^2 N}
\]

for all sufficiently large $N$. (The multiple $(KL)!$ corresponds to the number
of terms in expanding the determinant, and $(4N)^{K \cdot KL}$ estimates the length
of each of these terms from above.) By Theorem 8.2 we obtain

2Ne 
In|A| = In| P(ai1, a2, B)| > ~e(w8int 42. nN + Nem) 
n 


N4 1 


for all sufficiently large N, which completes the proof of Lemma 8.3. 


8.2 Interpolation determinants 


Lemma 8.4. Let $R > \rho > 0$. Assume we have $M$ complex numbers
$\xi_1, \dots, \xi_M$ inside the disc $|\xi| \le \rho$ and $M$ functions $f_1(z), \dots, f_M(z)$ which
are analytic in the disc $|z| \le R$ such that

\[
|f_i(z)|_R = \max_{|z| \le R} |f_i(z)| \le S, \quad i = 1, \dots, M.
\]

Denote $\delta = \det \| f_i(\xi_j) \|_{1 \le i,j \le M}$. Then the following estimate holds:

\[
|\delta| \le M! \, \Big( \frac{R}{\rho} \Big)^{-M(M-1)/2} S^M.
\]

Remark. The determinant $\delta$ is called an interpolation determinant. The
reason behind this definition is as follows.

Given a function $g(z)$ and numbers $\xi_1, \dots, \xi_M$, a standard interpolation
problem is finding a polynomial $F(z)$ of degree at most $M - 1$ such that
$F(\xi_i) = g(\xi_i)$ for all $i = 1, \dots, M$. Let us complicate the problem by
introducing $M$ analytic functions $f_1(z), \dots, f_M(z)$ and asking to determine the
coefficients in the representation $F(z) = a_1 f_1(z) + \dots + a_M f_M(z)$ such that
$F(\xi_i) = g(\xi_i)$ for all $i = 1, \dots, M$. (The particular choice $f_j(z) = z^{j-1}$
corresponds to the standard interpolation problem.) In order to solve the
corresponding linear system

\[
a_1 f_1(\xi_i) + \dots + a_M f_M(\xi_i) = g(\xi_i), \quad i = 1, \dots, M,
\]

we need the nonvanishing of its determinant, which is exactly equal to $\delta$.
(Of course, the system has a unique solution if and only if $\delta \ne 0$, and this
solution can be found by using, for example, Cramer’s rule.)

Proof of Lemma 8.2. Our choice of the functions and points is:

\[
f_{u,v}(z) = z^u e^{v z \ln \alpha}, \quad 0 \le u < K, \ 0 \le v < L; \qquad
\xi_{r,s} = r + s\beta, \quad (r,s) \in \mathcal{L},
\]

where $\ln \alpha$ is a fixed branch of the logarithmic function. Note that
$M = KL \sim N^2$. Since

\[
|\xi_{r,s}| = |r + s\beta| \le 2N(1 + |\beta|),
\]

take $\rho = 2N(1 + |\beta|)$ and $R = e\rho$. Furthermore, eventually we have in the
disc |z| < R 
N? 2 
— |eMerzina| < pK, RL\|Inal < <eN 
Mfaso(2)] = [2%] < RE RHP Al < expe) <0, 


so that Lemma 8.4 is applicable with S = e% °; 


\5| = |A] < Mle5M(M-1)/2, fenNe z (ey eo ee < ao 


for all sufficiently large N. This proves Lemma 8.2. 


Proof of Lemma 8.4. Consider the auxiliary function

\[
F(z) = \det \| f_i(\xi_j z) \|_{1 \le i,j \le M},
\]

so that $F(1) = \delta$. We shall demonstrate that $\operatorname{ord}_{z=0} F(z) \ge M(M-1)/2$.
Note that $F(z)$ depends on each function $f_i(z)$ linearly, that is, if
$f_i = C_1 f_i^{(1)} + C_2 f_i^{(2)}$, then the determinant $F(z)$ equals the sum of the
corresponding determinants $F^{(1)}$ and $F^{(2)}$, multiplied by $C_1$ and $C_2$, respectively
($F^{(k)}$ is obtained from $F$ by putting $f_i^{(k)}$ instead of $f_i$ in the $i$th row).
Writing each of the functions $f_i(z)$ in the form $\tilde f_i(z) + z^{M(M-1)/2} g_i(z)$,
where $\tilde f_i(z)$ is a polynomial of degree at most $M(M-1)/2 - 1$ and
$g_i(z)$ is analytic at the origin, it is therefore sufficient to verify the
estimate $\operatorname{ord}_{z=0} \tilde F(z) \ge M(M-1)/2$ for the determinant

\[
\tilde F(z) = \det \| \tilde f_i(\xi_j z) \|_{1 \le i,j \le M}
\]

instead. Again, the linearity and the expression of each polynomial $\tilde f_i(z)$ as a
(finite) sum of monomials $C z^l$ allow us to reduce the verification to the
particular case $\tilde f_i(z) = z^{l_i}$ for $i = 1, \dots, M$. Then

\[
\tilde F(z) = \det \| \xi_j^{l_i} z^{l_i} \|_{1 \le i,j \le M}
= z^{l_1 + \dots + l_M} \cdot \det \| \xi_j^{l_i} \|_{1 \le i,j \le M}.
\]

If $l_{i_1} = l_{i_2}$ for some $i_1 \ne i_2$, then the latter determinant involves equal
rows (of indices $i_1, i_2$), so that $\tilde F(z) \equiv 0$ and $\operatorname{ord}_{z=0} \tilde F(z) \ge M(M-1)/2$
is automatically satisfied. Otherwise, all $l_i$ are pairwise distinct, and the
latter representation implies

\[
\operatorname{ord}_{z=0} \tilde F(z) \ge l_1 + \dots + l_M \ge 0 + 1 + 2 + \dots + (M-1) = \frac{M(M-1)}{2},
\]

which is the required bound.

Thus, we have shown that the function $G(z) = F(z) \cdot z^{-M(M-1)/2}$ is
analytic in the disc $|z| \le R/\rho$. In addition, $\delta = F(1) = G(1)$, so that by the
maximum modulus principle

\[
|\delta| = |G(1)| \le |G(z)|_{R/\rho}
= \Big( \frac{R}{\rho} \Big)^{-M(M-1)/2} |F(z)|_{R/\rho}
\le \Big( \frac{R}{\rho} \Big)^{-M(M-1)/2} \cdot M! \, S^M.
\]

Lemma 8.4 is basically a generalization of the following classical result.


Lemma 8.5 (Schwarz lemma). Let $f(z)$ be a holomorphic map of the disc
$|z| \le 1$ into itself, that is, $|f(z)| \le 1$, such that $f(0) = 0$. Then $|f(z)| \le |z|$.


Proof. For the function $g(z) = f(z)/z$, which is holomorphic in the unit disc, the
maximum modulus principle implies

\[
|g(z)| \le |g(z)|_1 = |f(z)|_1 \le 1,
\]

which leads to the required result.

8.3 Rank of interpolation matrix


Proof of Lemma 8.1. Assume on the contrary that the rank of $M$ is strictly
less than $KL$, so that the columns of the matrix are linearly dependent. The
latter means that there exists a polynomial

\[
P(x, y) = \sum_{u=0}^{K-1} \sum_{v=0}^{L-1} C_{uv} x^u y^v \not\equiv 0
\]

such that

\[
P(r + s\beta, \alpha_1^r \alpha_2^s) = 0 \quad\text{for all } 0 \le r < 2N, \ 0 \le s < 2N. \tag{8.1}
\]

Our nearest goal is to show that the equality in (8.1) is violated for at least
one pair $r, s$.

Remark. In general, the set of zeros of a generic 2-variable polynomial
$P(x, y)$ is infinite and forms a 1-dimensional variety in $\mathbb{C}^2$. However, in our
situation conditions (8.1) mean that the polynomial $P(x, y)$ has ‘too many’
zeros along the group $G \simeq \mathbb{C}_+ \times \mathbb{C}_\times$ in $\mathbb{C}^2$ equipped with group operation
$(m_1, n_1) + (m_2, n_2) = (m_1 + m_2, n_1 n_2)$ and generators $(1, \alpha)$ and $(\beta, \alpha^{\beta})$. It
is not hard to see that

\[
(1, \alpha)^r \cdot (\beta, \alpha^{\beta})^s = (r + s\beta, \alpha_1^r \alpha_2^s) \quad\text{for all } r, s \in \mathbb{Z}.
\]

This interpretation motivates considering a more general problem of
estimating the number of zeros of a polynomial $P(x_1, \dots, x_m)$ on a set
which possesses a group structure in $\mathbb{C}^m$. There have been numerous results in
this direction over the last five decades, including Baker’s famous linear forms
in logarithms [7, 19, 78].


Lemma 8.6. Let $P \in \mathbb{C}[x, y]$, $P \not\equiv 0$, $\deg_x P < K$ and $\deg_y P < L$.
Furthermore, assume that the set

\[
\{ \alpha_1^r \alpha_2^s : 0 \le r < R_1, \ 0 \le s < S_1 \} \tag{8.2}
\]

has at least $L$ distinct elements, while the number of distinct elements in
the set

\[
\{ r + s\beta : 0 \le r < R_2, \ 0 \le s < S_2 \} \tag{8.3}
\]

is greater than $(K-1)L$. Then at least one of the numbers

\[
P(r + s\beta, \alpha_1^r \alpha_2^s), \quad\text{where } 0 \le r < R_1 + R_2 - 1, \ 0 \le s < S_1 + S_2 - 1, \tag{8.4}
\]

is nonzero.

First we demonstrate how Lemma 8.6 with the choice $R_1 = R_2 = S_1 =
S_2 = N$ implies Lemma 8.1.


Lemma 8.7. Under the hypothesis of Theorem 8.1, at least one of the two
numbers $\alpha_1 = \alpha$ and $\alpha_2 = \alpha^{\beta}$ is not a root of unity.

Proof. If both are, say $\alpha_1 = e^{2\pi i k_1/m}$ and $\alpha_2 = e^{2\pi i k_2/n}$, then

\[
m \ln \alpha = 2\pi i k_1, \qquad n \beta \ln \alpha = 2\pi i k_2,
\]

so that $\beta$ is rational, which contradicts the hypothesis.


If $\alpha_i$ (with $i = 1$ or $i = 2$) is not a root of unity, then the elements
$\alpha_i^k$, $0 \le k < N$, of the set (8.2) are all distinct, so that (8.2) contains
at least $N > L = \lfloor N/\ln N \rfloor$ elements. By the irrationality of $\beta$ all elements
in (8.3) (whose number is $N^2 \ge KL > (K-1)L$) are pairwise distinct. Therefore,
Lemma 8.6 implies that at least one of the numbers $P(r + s\beta, \alpha_1^r \alpha_2^s)$,
$0 \le r, s < 2N - 1$, does not vanish, which contradicts (8.1), and Lemma 8.1 follows.


It remains to show Lemma 8.6. We will use the following simple obser- 
vation. 


Lemma 8.8. Let ki, k2,...,kn be integers, 0 < ky < ko < +--+ <kyn < LD, 
and let E c C \ {0} be a certain (finite) numerical set which contains at 
least L distinct elements. Then there exist n numbers aj,...,@n € E such 
that the square matrix \la** lla<igen is not degenerate. 


First proof. Take L different numbers b),..., 62 in € and consider the Van- 
dermonde determinant 


det [OF "i<ngcn=+ J] ()-8:) 40. 
1<i<j<L 


The rows of the determinant are linearly independent; in particular, the 
rows with indices kj +1,k2+1,...,k, +1 are linearly independent. Take 
a nonzero minor spanned by the rows: it corresponds to the required non- 


degenerate square matrix. 


Second proof. We proceed by induction on $n$. If $n=1$, then $a_1$ can be
taken to be any element of $\mathcal E\subset\mathbb C\setminus\{0\}$.

Assume that we have already found a collection $a_1,\dots,a_{n-1}\in\mathcal E$
such that the determinant $\Delta=\det\|a_j^{k_i}\|_{1\le i,j\le n-1}$ does not vanish. Consider the polynomial
\[
P(z)=\det\begin{pmatrix}
a_1^{k_1}&\cdots&a_{n-1}^{k_1}&z^{k_1}\\
\vdots&&\vdots&\vdots\\
a_1^{k_n}&\cdots&a_{n-1}^{k_n}&z^{k_n}
\end{pmatrix}. \tag{8.5}
\]
The substitutions $z=a_j$, $j=1,\dots,n-1$, obviously make it zero, so that
\[
P(z)=(z-a_1)\cdots(z-a_{n-1})\,Q(z)
\]
for a certain polynomial $Q(z)$. Expanding the determinant (8.5) along
the last column we see that $P(z)=\Delta z^{k_n}+{}$lower degree terms; so that
$\deg P=k_n<L$ and $\deg Q=\deg P-(n-1)<L-n+1$. The set
$\mathcal E\setminus\{a_1,\dots,a_{n-1}\}$ contains at least $L-n+1>\deg Q$ elements, so that for
at least one of them, say $a_n$, we have $Q(a_n)\ne0$. This proves the induction
step and completes the proof of the lemma.
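The induction in the second proof is effectively a greedy algorithm, and it can be tried out on small data. The following is a minimal sketch only: the helper names `det` and `choose_points`, as well as the sample exponents and candidate set, are illustrative assumptions and not part of the text. Exact rational arithmetic is used so that "nonzero determinant" is decided without rounding.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def choose_points(exponents, candidates):
    """Greedy choice of a_1, ..., a_n following the induction step:
    at step n keep a_1..a_{n-1} and pick a new a_n with nonzero determinant."""
    chosen = []
    for step in range(1, len(exponents) + 1):
        ks = exponents[:step]
        for a in candidates:
            if a in chosen:
                continue
            trial = chosen + [a]
            # rows are indexed by the exponents k_i, columns by the points a_j
            if det([[x ** k for x in trial] for k in ks]) != 0:
                chosen = trial
                break
        else:
            raise ValueError("no admissible point; need at least L distinct candidates")
    return chosen

exponents = [0, 2, 3]        # hypothetical data: 0 <= k_1 < k_2 < k_3 < L with L = 4
candidates = [1, -1, 2, 3]   # L = 4 distinct nonzero numbers
points = choose_points(exponents, candidates)
print(points, det([[x ** k for x in points] for k in exponents]))
```

Note that the candidate $-1$ is rejected at the second step (the rows for exponents 0 and 2 coincide on $\{1,-1\}$) but becomes admissible at the third, which is exactly the freedom the counting argument with $\deg Q$ provides.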


Proof of Lemma 8.6. We argue by contradiction, assuming that
all the numbers in (8.4) are zero.

Expand the polynomial $P(x,y)$ in powers of $y$, writing only those monomials $y^k$ whose coefficients are nonzero:
\[
P(x,y)=\sum_{i=1}^nQ_i(x)y^{k_i}, \quad Q_i(x)\not\equiv0 \ \text{for}\ i=1,\dots,n, \quad 0\le k_1<k_2<\dots<k_n<L.
\]
Define the set
\[
\mathcal E=\{\alpha_1^r\alpha_2^s:0\le r<R_1,\ 0\le s<S_1\}\subset\mathbb C\setminus\{0\};
\]
the number of distinct elements in $\mathcal E$ is at least $L$ by the hypothesis. In
accordance with Lemma 8.8 choose an $n$-element set $\mathcal L$ of pairs $(r,s)$, $0\le r<R_1$, $0\le s<S_1$,
such that
\[
B=\det\|(\alpha_1^r\alpha_2^s)^{k_i}\|_{i=1,\dots,n;\,(r,s)\in\mathcal L}\ne0.
\]
Consider the polynomials
\[
P_{r,s}(x,y)=P(x+r+s\beta,\alpha_1^r\alpha_2^sy)
=\sum_{i=1}^nQ_i(x+r+s\beta)(\alpha_1^r\alpha_2^s)^{k_i}y^{k_i}, \quad (r,s)\in\mathcal L.
\]
By our assumption
\[
P_{r,s}(r'+s'\beta,\alpha_1^{r'}\alpha_2^{s'})=P\bigl((r+r')+(s+s')\beta,\alpha_1^{r+r'}\alpha_2^{s+s'}\bigr)=0
\quad\text{for all}\ 0\le r'<R_2,\ 0\le s'<S_2. \tag{8.6}
\]


Finally, define
\[
\Delta(x)=\det\|Q_i(x+r+s\beta)(\alpha_1^r\alpha_2^s)^{k_i}\|_{i=1,\dots,n;\,(r,s)\in\mathcal L}.
\]
Each of the polynomials $Q_i(x)$ is given in the form $Q_i(x)=b_ix^{m_i}+{}$lower degree terms, where $b_i$ are nonzero. Therefore, expanding the determinant $\Delta(x)$ we obtain $\Delta(x)=Ax^{m_1+\dots+m_n}+{}$lower degree terms, where
\[
A=b_1\cdots b_n\cdot B\ne0.
\]
Thus, $\Delta(x)\not\equiv0$ and $\deg\Delta(x)=m_1+\dots+m_n$.
Consider now the system of $n$ linear equations
\[
\sum_{i=1}^nQ_i(x+r+s\beta)(\alpha_1^r\alpha_2^s)^{k_i}y^{k_i}=P_{r,s}(x,y), \quad (r,s)\in\mathcal L,
\]
in which $y^{k_i}$ are counted as $n$ unknowns. The determinant of the system is
exactly $\Delta(x)$. Solving the system by Cramer's rule we get
\[
\Delta(x)\cdot y^{k_i}=\Delta_i \quad\text{for}\ i=1,\dots,n, \tag{8.7}
\]
where the determinant $\Delta_i$ is obtained from $\Delta$ by replacing the $i$th column
with $\|P_{r,s}(x,y)\|_{(r,s)\in\mathcal L}$. Substituting $x=r'+s'\beta$ and $y=\alpha_1^{r'}\alpha_2^{s'}$, where
$r'=0,1,\dots,R_2-1$ and $s'=0,1,\dots,S_2-1$, makes the latter column vanish,
so that the determinants $\Delta_i$ in (8.7) vanish as well:
\[
\Delta(r'+s'\beta)\cdot(\alpha_1^{r'}\alpha_2^{s'})^{k_i}=0, \quad i=1,\dots,n,
\quad\text{for all}\ 0\le r'<R_2,\ 0\le s'<S_2.
\]
This, in turn, implies that $\Delta(r'+s'\beta)=0$ for all $0\le r'<R_2$ and $0\le s'<S_2$. By the hypothesis, there are more than $(K-1)L$ distinct numbers
in the set (8.3), so that the number of zeros of the polynomial $\Delta(x)$ must
be greater than $(K-1)L$, hence $\deg\Delta(x)>(K-1)L$. On the other hand,
\[
\deg\Delta(x)=m_1+\dots+m_n\le(K-1)n\le(K-1)L.
\]
The contradiction we have arrived at shows that at least one of the numbers
in (8.4) does not vanish.
8.4 Schneider's proof of the principal theorem


Assume on the contrary that the numbers $\alpha$, $\beta$ and $\gamma=\alpha^\beta$ are algebraic. Since
this assumption is equivalent to the algebraicity of $\alpha_1=\alpha^\beta$, $\beta_1=1/\beta$ and
$\gamma_1=\alpha_1^{\beta_1}=\alpha$, we can swap the original set with this one and proceed with
the proof for the newer set instead. Of the two sets,
$\alpha,\beta,\gamma$ and $\alpha_1,\beta_1,\gamma_1$, we choose one for which $\alpha$
is not a root of unity (such a choice is guaranteed by Lemma 8.7) and call it $\alpha,\beta,\gamma$ in what follows.

As in the previous proof we choose the parameters $K=\lfloor N\ln N\rfloor$ and
$L=\lfloor N/\ln N\rfloor$. The field generated by the algebraic numbers $\alpha,\beta,\gamma$ is denoted
by $F$, while its ring of integers is denoted by $\mathbb Z_F$. The notation $\overline{|a|}$ (the house)
from Section 6.3 stands for the maximum of the absolute values of $a\in F$
and all its conjugates in $F$.


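Lemma 8.9. For any sufficiently large $N$, there exists a function
\[
f(z)=\sum_{l=0}^{L-1}P_l(z)\alpha^{lz}\not\equiv0, \quad P_l(z)=\sum_{k=0}^{K-1}A_{lk}z^k, \quad l=0,1,\dots,L-1, \tag{8.8}
\]
with coefficients $A_{lk}\in\mathbb Z_F$ (not all simultaneously zero) such that
\[
\overline{|A_{lk}|}\le e^{N^2/\sqrt{\ln N}} \tag{8.9}
\]
and
\[
f(r+s\beta)=0 \quad\text{for all}\ 0\le r,s<M=\Bigl\lfloor\frac N2\Bigr\rfloor. \tag{8.10}
\]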
Remark. Conditions (8.10) correspond to the linear system
\[
\sum_{k=0}^{K-1}\sum_{l=0}^{L-1}A_{lk}\cdot(r+s\beta)^k\alpha^{l(r+s\beta)}=0, \quad 0\le r,s<M, \tag{8.11}
\]
with $A_{lk}$ as unknowns, $0\le l\le L-1$, $0\le k<K$. Since $\alpha^{l(r+s\beta)}=(\alpha_1^r\alpha_2^s)^l$,
the matrix of the linear system is exactly the matrix $\mathcal M$ from Section 8.1
when $M=2N$. In other words, our proof of Theorem 8.1 with interpolation
determinants used the fact that a nonzero minor of maximal order $KL$ of the
matrix corresponding to the system (8.11) is itself sufficiently small, and at
the same time it is a polynomial in the numbers $\alpha,\beta,\gamma$ under consideration.
It is this circumstance which underlies many other proofs by Laurent's
method of interpolation determinants: instead of solving a linear system,
one investigates a nonzero minor of the corresponding matrix.


Proof. To solve the system (8.11) we use the following result.

Exercise 8.2 (Siegel lemma). Show that the system of linear equations
\[
\sum_{j=1}^qa_{ij}x_j=0, \quad i=1,\dots,p,
\]
with coefficients $a_{ij}\in\mathbb Z_F$, $\overline{|a_{ij}|}\le A$, in $q>p$ unknowns $x_1,\dots,x_q$ possesses a non-trivial solution
\[
x_j\in\mathbb Z_F, \quad \overline{|x_j|}\le c(cqA)^{p/(q-p)}, \quad j=1,\dots,q,
\]
where the positive constant $c$ depends only on the field $F$.


Hint. Prove the statement first for the case F = Q using the pigeon hole 
principle (as in the proof of Theorem 6.3); then reduce the general case to 
this particular situation. 
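For the case $F=\mathbb Q$ mentioned in the hint, the classical form of the Siegel lemma guarantees a nonzero integer solution with $\max_j|x_j|\le(qA)^{p/(q-p)}$. The sketch below merely illustrates that bound by brute force on one small system; the coefficient matrix `rows` and the helper name `small_solution` are made up for the illustration, and exhaustive search is of course not the pigeonhole argument itself.

```python
from itertools import product

def small_solution(rows, bound):
    """Search for a nonzero integer vector x with max|x_j| <= bound solving rows . x = 0."""
    q = len(rows[0])
    for x in product(range(-bound, bound + 1), repeat=q):
        if any(x) and all(sum(a * v for a, v in zip(row, x)) == 0 for row in rows):
            return x
    return None

rows = [[3, 2, -3, 1],      # hypothetical system: p = 2 equations
        [1, -3, 2, 2]]      # in q = 4 unknowns
p, q = len(rows), len(rows[0])
A = max(abs(a) for row in rows for a in row)
bound = int((q * A) ** (p / (q - p)))   # (qA)^{p/(q-p)} from the classical statement
solution = small_solution(rows, bound)
print(bound, solution)
```

Here $q-p=p$, so the exponent equals 1 and the search box has side $qA=12$; the point of the lemma is that such a box always contains a solution, however the coefficients are chosen.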


In our case the number of unknowns, $q=KL\sim N^2$, is greater than the
number of equations, $p=M^2\sim N^2/4$. If a positive integer $d$ is such that
$d\alpha,d\beta,d\gamma\in\mathbb Z_F$, then after multiplying all equations in (8.11) by $d^{K+2LM}$
we obtain a system of linear equations with coefficients from $\mathbb Z_F$. The
coefficients are then estimated as follows:
\[
\overline{\bigl|d^{K+2LM}(r+s\beta)^k\alpha^{l(r+s\beta)}\bigr|}
\le d^{K+2LM}\bigl(M(1+\overline{|\beta|})\bigr)^K\,\overline{|\alpha|}^{LM}\,\overline{|\gamma|}^{LM}
\le e^{c_1N^2/\ln N}=A.
\]
In addition, $p/(q-p)\sim1/3$ as $N\to\infty$. By the Siegel lemma there exists
a non-trivial solution of (8.11) in unknowns $A_{lk}\in\mathbb Z_F$ such that
\[
\overline{|A_{lk}|}\le c\bigl(cN^2e^{c_1N^2/\ln N}\bigr)^{p/(q-p)}\le e^{c_2N^2/\ln N}<e^{N^2/\sqrt{\ln N}}
\]
for any sufficiently large $N$. This proves Lemma 8.9.
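The counting behind this application of the Siegel lemma is easy to tabulate. The snippet below is only a numerical sanity check, with the function name `parameters` chosen here for illustration; it computes $K=\lfloor N\ln N\rfloor$, $L=\lfloor N/\ln N\rfloor$, $M=\lfloor N/2\rfloor$ and shows that $q=KL$ exceeds $p=M^2$ while the exponent $p/(q-p)$ approaches $1/3$.

```python
import math

def parameters(N):
    """Return K, L, M, the numbers of unknowns q and equations p, and p/(q-p)."""
    K = math.floor(N * math.log(N))
    L = math.floor(N / math.log(N))
    M = N // 2
    q, p = K * L, M * M
    return K, L, M, q, p, p / (q - p)

for N in (10**2, 10**4, 10**6):
    K, L, M, q, p, ratio = parameters(N)
    print(N, q > p, round(ratio, 4))
```

The convergence to $1/3$ is slow for small $N$ because of the integer parts in $K$ and $L$, which is harmless since the argument only needs the exponent to stay bounded.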


From Lemma 8.6 established above, with the same choice $R_1=R_2=S_1=S_2=N$, it follows that there exists a pair $r_0,s_0$, $0\le r_0,s_0<2N$,
such that $f(r_0+s_0\beta)\ne0$ for the function $f(z)$ constructed in Lemma 8.9.
Denote $\delta=f(r_0+s_0\beta)\ne0$; we will estimate this number from above and
from below.

An estimate of $\delta$ from above. Consider the function
\[
g(z)=\frac{f(z)}{\prod_{0\le r,s<M}(z-r-s\beta)}.
\]
By (8.10) it is analytic on $\mathbb C$. We have $|r_0+s_0\beta|<2N(1+|\beta|)$. Define
the radii $\rho=2N(1+|\beta|)$ and $R=5\rho$, and apply the maximum modulus
principle:
\[
|\delta|=|f(r_0+s_0\beta)|\le|g|_\rho\cdot\Bigl|\prod_{0\le r,s<M}(z-r-s\beta)\Bigr|_\rho
\le|g|_R\cdot\Bigl|\prod_{0\le r,s<M}(z-r-s\beta)\Bigr|_\rho
\le|f|_R\cdot\prod_{0\le r,s<M}\biggl|\frac{z_1-r-s\beta}{z_2-r-s\beta}\biggr|, \tag{8.12}
\]
where the points $z_1$, $|z_1|=\rho$, and $z_2$, $|z_2|=R$, are taken in such a way
that the maximum modulus of $\prod_{0\le r,s<M}(z-r-s\beta)$ on the circle $|z|=\rho$
is attained at $z_1$ and its minimum modulus on the circle $|z|=R$ is attained at $z_2$. Then
\[
|z_1-r-s\beta|\le|z_1|+|r+s\beta|\le2\rho, \qquad
|z_2-r-s\beta|\ge|z_2|-|r+s\beta|\ge R-\rho,
\]
therefore, the estimate in (8.12) can be continued as follows:
\[
|\delta|\le|f|_R\cdot\prod_{0\le r,s<M}\frac{2\rho}{R-\rho}=|f|_R\cdot2^{-M^2}.
\]
Now, using definition (8.8) of the function $f(z)$ and the estimates (8.9), we
get
\[
|f|_R\le KL\cdot e^{N^2/\sqrt{\ln N}}\cdot R^K\cdot e^{LR|\ln\alpha|}<e^{2N^2/\sqrt{\ln N}}
\]
implying
\[
|\delta|<e^{2N^2/\sqrt{\ln N}}\cdot e^{-(N^2\ln2)/4}<e^{-N^2/10} \tag{8.13}
\]
for all sufficiently large $N$.
An estimate of $\delta$ from below. Write $\delta$ as $\delta=Q(\alpha,\beta,\gamma)$, where
\[
Q(x,y,z)=\sum_k\sum_lA_{lk}(r_0+s_0y)^kx^{lr_0}z^{ls_0}\in\mathbb Z_F[x,y,z].
\]
By Liouville's theorem the estimate
\[
\ln|Q(\alpha,\beta,\gamma)|\ge-c\bigl(\deg Q+\ln\overline{|Q|}\bigr)
\]
holds, where the constant $c>0$ depends only on the field $F$ and $\overline{|Q|}$ denotes
the sum of the houses of the coefficients of $Q$. In our situation
\[
\deg Q\le3LM\le\frac{3N^2}{\ln N}, \qquad
\ln\overline{|Q|}\le\frac{N^2}{\sqrt{\ln N}}+K\ln\bigl(M(1+\overline{|\beta|})\bigr)\le\frac{2N^2}{\sqrt{\ln N}},
\]
so that
\[
|\delta|=|Q(\alpha,\beta,\gamma)|\ge e^{-3cN^2/\sqrt{\ln N}} \tag{8.14}
\]
for all sufficiently large $N$.

The estimates (8.13) and (8.14) contradict each other. Hence our assumption about the simultaneous algebraicity of $\alpha$, $\beta$ and $\gamma=\alpha^\beta$ is false.
This completes the proof of Theorem 8.1.
Chapter notes


The resolutions of Hilbert’s seventh problem were published independently 
(and almost simultaneously) in 1934. In spite of the similarity of Gel- 
fond’s and Schneider’s methods, constructions of the auxiliary function 
were quite different (Lemma 8.9 highlights Schneider’s choice), and this 
difference played a crucial role in the later development of the theory of 
transcendental numbers. For example, Schneider’s version was used by 
Schneider himself [70] to prove results about the transcendence of values 
of elliptic functions and elliptic modular functions; in a general form the 
results are known as the Schneider—Lang theorem [53]. The development of 
Gelfond’s method culminated in what is called Baker’s theorem — effective 
lower bounds for the absolute value of linear combinations of logarithms of 
algebraic numbers [7, 19, 78]. 

Laurent's method of interpolation determinants is comparatively young
[54, 55] but has already proved powerful in applications to transcendental numbers. One of its outcomes is sharp bounds for linear forms in two
logarithms (of algebraic numbers) [56], which are of particular importance
for applications to diophantine equations.
Chapter 9


Schinzel–Zassenhaus conjecture


The aim of this chapter is to outline a remarkable proof of the Schinzel–Zassenhaus conjecture given at the end of 2019 by V. Dimitrov in [26].


Theorem 9.1 (Dimitrov [26]). For an algebraic integer $\alpha$ of degree $d$, not a
root of unity, its house $\overline{|\alpha|}$ (the maximum modulus of its conjugates, including
$\alpha$ itself) satisfies $\overline{|\alpha|}\ge2^{1/(4d)}$.


This indeed confirms the 1965 expectation of Schinzel and Zassenhaus in
[69] about the bound $\overline{|\alpha|}\ge1+c/d$ for some absolute constant $c>0$.
Theorem 9.1 allows one to take $c=(\log2)/4$. The earlier recorded partial
resolutions of the Schinzel–Zassenhaus conjecture all appealed to related
resolutions of Lehmer's problem [18].
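The passage from the bound $2^{1/(4d)}$ to the constant $c=(\log 2)/4$ rests on the inequality $e^x\ge1+x$ with $x=(\log2)/(4d)$. A quick numerical illustration (the function name `compare` is an assumption made for this sketch):

```python
import math

c = math.log(2) / 4

def compare(d):
    """Return Dimitrov's bound 2^(1/(4d)) and the Schinzel-Zassenhaus form 1 + c/d."""
    return 2 ** (1 / (4 * d)), 1 + c / d

for d in (1, 2, 10, 100, 10**4):
    bound, linear = compare(d)
    print(d, bound, linear, bound >= linear)
```

The two quantities agree to first order in $1/d$, so no constant larger than $(\log2)/4$ can be extracted from the bound $2^{1/(4d)}$ uniformly in $d$.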

Dimitrov’s proof is based on the following ingredients given in Proposi- 
tions 9.1, 9.2, 9.3 and 9.4 below. 


Proposition 9.1 (Dimitrov [26]). For an algebraic integer $\alpha$, denote by
$P(x)=\prod_{j=1}^d(x-\alpha_j)\in\mathbb Z[x]$ its minimal (monic!) polynomial. Introduce
additionally the polynomials
\[
P_2(x)=\prod_{j=1}^d(x-\alpha_j^2)\in\mathbb Z[x] \quad\text{and}\quad P_4(x)=\prod_{j=1}^d(x-\alpha_j^4)\in\mathbb Z[x],
\]
and assume that $P_2(x)$ is irreducible over $\mathbb Z$. Then
\[
f(z)=\sqrt{P_2(z)P_4(z)/z^{2d}}\in1+z^{-1}\mathbb Z[[z^{-1}]],
\]
and $f(z)$ is rational if and only if $P(x)$ is cyclotomic.


Notice that the statement translates cyclotomicity of $P(x)$ into a rationality criterion for $f(z)$; it sieves out those $\alpha$ that are roots of unity.
Proposition 9.2 (Pólya [62]). For a compact $K$ in $\mathbb C$, let $\widehat K$ be a simply
connected compact containing $K$. Assume that the function $f(z)$ is analytic
on $\mathbb C\setminus\widehat K$ (that is, on a connected component of the complement of $K$ which
contains $\infty$) and possesses the expansion
\[
f(z)=\sum_{k=0}^\infty\frac{a_k}{z^{k+1}}
\]
at $\infty$. Define the Hankel determinants $A_n=\det_{0\le j,\ell<n}(a_{j+\ell})$ for $n=1,2,\dots$. Then
\[
\limsup_{n\to\infty}|A_n|^{1/n^2}\le t(K),
\]
the transfinite diameter of $K$.


The transfinite diameter $t(K)$ of a compact $K\subset\mathbb C$, also known as the
(logarithmic) capacity of $K$ or the Chebyshev–Fekete constant of $K$, is
defined in Section 9.2.

The next statement is known as Kronecker’s rationality criterion. 


Proposition 9.3 (Kronecker [50, pp. 566-567]). Let $f(x)=\sum_{n=0}^\infty a_nx^n\in\mathbb C[[x]]$ be a formal power series. Then $f(x)$ is a quotient of two polynomials
(in other words, represents a rational function) if and only if $A_n=0$ for
all $n\ge n_1$, where $A_n=\det_{0\le j,\ell<n}(a_{j+\ell})$.


The following result is a consequence of Dubinin's solution of a problem
of Gonchar. In order to state it, define a hedgehog with vertices $\beta_1,\dots,\beta_d\in\mathbb C^\times$, notation $K(\beta_1,\dots,\beta_d)\subset\mathbb C$, to be the union of the $d$ closed radial
segments $[0,\beta_j]$ joining the origin 0 to the points $\beta_j$ in the complex plane,
for $j=1,\dots,d$. Note that a hedgehog $K$ is already simply connected, so
that Proposition 9.2 applies to $\widehat K=K$.


Proposition 9.4 (Dubinin [27]). The hedgehog $K=K(\beta_1,\dots,\beta_d)\subset\mathbb C$
has transfinite diameter $t(K)$ at most
\[
\Bigl(\frac14\max_{1\le j\le d}|\beta_j|^d\Bigr)^{1/d}=4^{-1/d}\max_{1\le j\le d}|\beta_j|.
\]

Proof of Theorem 9.1. Throughout the proof we assume that $\alpha$ is not a
root of unity, so that $\overline{|\alpha|}>1$.

We proceed by induction on the degree $d$; the estimate $\overline{|\alpha|}\ge2>2^{1/4}$ is
clearly true when $d=1$, and we assume that the theorem is shown for all
algebraic integers of degree less than the given $d>1$.
If $P_2(x)$ is reducible, then $\alpha^2$ has degree $d/2$, so that it satisfies $\overline{|\alpha^2|}\ge2^{1/(2d)}$ by the induction hypothesis. Since $\overline{|\alpha^2|}=\overline{|\alpha|}^2$, we get the desired
inequality. Therefore, we can assume that $P_2(x)$ is irreducible, hence by
Proposition 9.1 the function
\[
f(z)=\sqrt{P_2(z)P_4(z)/z^{2d}}=1+\sum_{k=1}^\infty\frac{a_k}{z^k}
\]
has all coefficients at $\infty$ integral and is irrational. Then by Proposition 9.3
infinitely many of the Hankel determinants $A_n=\det_{0\le i,j<n}(a_{i+j})\in\mathbb Z$ do
not vanish; in particular, all those satisfy $|A_n|\ge1$. By Proposition 9.2
this means that $t(K)\ge1$, where $K$ is the hedgehog spanned by $\alpha^2$, $\alpha^4$
and all their conjugates. On the other hand, Proposition 9.4 implies that $t(K)\le4^{-1/(2d)}\,\overline{|\alpha|}^4$. Combining the two estimates implies $\overline{|\alpha|}^4\ge4^{1/(2d)}=2^{1/d}$ and
leads to the inequality claimed.


In the remaining part we prove Propositions 9.1—9.3 and give some in- 
tuition behind Proposition 9.4; each section takes care of the corresponding 
proposition. 


9.1 Dimitrov’s cyclotomicity criterion 


Notably, Fermat's little theorem $a^p\equiv a\pmod p$ for all $a\in\mathbb Z$ and primes $p$
generalises to Euler's congruence
\[
a^{p^r}\equiv a^{p^{r-1}}\pmod{p^r}, \quad\text{where}\ r=1,2,\dots,
\]
and further to the Gauss congruence
\[
\sum_{d\mid m}\mu\Bigl(\frac md\Bigr)a^d\equiv0\pmod m, \tag{9.1}
\]
where $\mu(\,\cdot\,)$ is the Möbius function, valid for all positive integers $m$; see
[72, 83]. The validity for $m=p^r=2^2$ can be verified by hand: if $a$ is
even then both $a^4$ and $a^2$ are divisible by 4; if $a=2k+1$ then
\[
a^4-a^2=(2k+1)^4-(2k+1)^2=16k^4+32k^3+20k^2+4k\equiv0\pmod4.
\]
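The Gauss congruence (9.1) is easy to test by machine for small $a$ and $m$. The following sketch uses a trial-division Möbius function; the names `mobius` and `gauss_sum` are chosen for this illustration, and a finite check is of course no substitute for the proof requested in Exercise 9.1.

```python
def mobius(n):
    """Moebius function computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a square factor p^2 divides the argument
            result = -result
        p += 1
    return -result if n > 1 else result

def gauss_sum(a, m):
    """Left-hand side of (9.1): sum over d | m of mu(m/d) * a^d."""
    return sum(mobius(m // d) * a ** d for d in range(1, m + 1) if m % d == 0)

violations = [(a, m) for a in range(-6, 7) for m in range(1, 41)
              if gauss_sum(a, m) % m != 0]
print(gauss_sum(3, 4), violations)
```

For $m=4$ the sum reduces to $a^4-a^2$ because $\mu(4)=0$, which is exactly the case treated by hand above.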


Exercise 9.1. Given $a\in\mathbb Z$, prove the Gauss congruence (9.1) for any $m=1,2,\dots$.


Exercise 9.2. Let $\{a_m\}_{m\ge1}$ be a sequence of integers. Then the following
two conditions are equivalent:
(i) for every $m=1,2,\dots$,
\[
\sum_{d\mid m}\mu\Bigl(\frac md\Bigr)a_d\equiv0\pmod m,
\]
that is, the sequence satisfies Gauss congruences; and
(ii) for all $n,s\ge1$ and all primes $p$, we have $a_{p^sn}\equiv a_{p^{s-1}n}\pmod{p^s}$.
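Both conditions of Exercise 9.2 can be observed on a concrete sequence. The sketch below takes the Lucas numbers $1,3,4,7,11,\dots$, which are the traces of powers of the integer matrix $\bigl(\begin{smallmatrix}1&1\\1&0\end{smallmatrix}\bigr)$; that such trace sequences satisfy Gauss congruences is a standard fact assumed here rather than taken from the text, and all function names are illustrative.

```python
def lucas(limit):
    """Lucas numbers a[0..limit] with a[0] = 2, a[1] = 1."""
    seq = [2, 1]
    while len(seq) <= limit:
        seq.append(seq[-1] + seq[-2])
    return seq

def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

LIMIT = 120
a = lucas(LIMIT)

def condition_i(m):
    """Gauss congruence for the index m."""
    return sum(mobius(m // d) * a[d] for d in range(1, m + 1) if m % d == 0) % m == 0

def condition_ii(p, s, n):
    """a_{p^s n} = a_{p^(s-1) n} modulo p^s."""
    return (a[p ** s * n] - a[p ** (s - 1) * n]) % p ** s == 0

ok_i = all(condition_i(m) for m in range(1, LIMIT + 1))
ok_ii = all(condition_ii(p, s, n)
            for p in (2, 3, 5, 7) for s in range(1, 7) for n in range(1, LIMIT + 1)
            if p ** s * n <= LIMIT)
print(ok_i, ok_ii)
```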


Lemma 9.1. For a monic polynomial $P(x)\in\mathbb Z[x]$, we have the congruence
\[
P_4(x)\equiv P_2(x)\pmod4,
\]
where the congruence is understood as the congruence of the corresponding
individual coefficients of polynomials.


Proof. Every symmetric function of the zeros $\alpha_1,\dots,\alpha_d$ of $P(x)=x^d-e_1x^{d-1}+e_2x^{d-2}-\dots+(-1)^de_d$ can be written as a polynomial in the elementary symmetric functions $e_1,e_2,\dots,e_d$. Such representations for the sums of powers
of the zeros, $s_k=\sum_{j=1}^d\alpha_j^k$, are known as the Newton–Girard identities;
explicitly, we have (easily derivable!)
\[
s_1=e_1, \quad s_2=e_1^2-2e_2, \quad s_3=e_1^3-3e_1e_2+3e_3, \tag{9.2}
\]
\[
s_4=e_1^4-4e_1^2e_2+4e_1e_3+2e_2^2-4e_4,
\]
and so on. The coefficients of these polynomials are always integral.
Now, since $e_1,e_2,e_3,e_4\in\mathbb Z$ and $e_1^4\equiv e_1^2\pmod4$ (by the above), $e_2^2\equiv-e_2\pmod2$, we deduce from (9.2) (from the expressions for $s_2$ and $s_4$
only) that $s_4\equiv s_2\pmod4$, that is,
\[
e_1(\alpha_1^4,\dots,\alpha_d^4)\equiv e_1(\alpha_1^2,\dots,\alpha_d^2)\pmod4.
\]
By replacing the original system of zeros with $\{\alpha_{j_1}\cdots\alpha_{j_k}:1\le j_1<\dots<j_k\le d\}$, hence the original polynomial with the corresponding one
(of possibly higher degree!), still monic and with integral coefficients, and
applying the same argument we deduce that
\[
e_k(\alpha_1^4,\dots,\alpha_d^4)\equiv e_k(\alpha_1^2,\dots,\alpha_d^2)\pmod4
\]
for $k=2,3,\dots,d$ as well.
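Lemma 9.1 can be checked on examples without computing any zeros, because the polynomial whose zeros are the squares of those of $P$ satisfies $P_2(x^2)=(-1)^dP(x)P(-x)$. The sketch below relies on this identity (an elementary observation, not stated in the text) and on hypothetical helper names; polynomials are lists of integer coefficients from the constant term upwards.

```python
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def squared_zeros(p):
    """For monic P of degree d return the monic polynomial with zeros alpha_j^2,
    using P_2(x^2) = (-1)^d P(x) P(-x)."""
    d = len(p) - 1
    p_minus = [(-1) ** i * c for i, c in enumerate(p)]   # coefficients of P(-x)
    prod = poly_mul(p, p_minus)                          # only even powers survive
    return [(-1) ** d * c for c in prod[::2]]

def lemma_holds(p):
    p2 = squared_zeros(p)
    p4 = squared_zeros(p2)
    return all((c4 - c2) % 4 == 0 for c2, c4 in zip(p2, p4))

samples = [
    [-1, -1, 1],          # x^2 - x - 1
    [-1, -1, 0, 1],       # x^3 - x - 1
    [-1, -2, 1, 1],       # x^3 + x^2 - 2x - 1
    [1, 1, 0, -1, -1, -1, -1, -1, 0, 1, 1],   # Lehmer's degree 10 polynomial
]
print([lemma_holds(p) for p in samples])
print(squared_zeros([-1, -1, 1]), squared_zeros(squared_zeros([-1, -1, 1])))
```

For $x^2-x-1$ this gives $P_2=x^2-3x+1$ and $P_4=x^2-7x+1$, whose difference is visibly divisible by 4.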


Proof of Proposition 9.1. Observe that
\[
\sqrt{1+4X}=1+2\sum_{n=1}^\infty(-1)^{n-1}C_{n-1}X^n\in1+2X\mathbb Z[[X]], \tag{9.3}
\]
where the integrality of the Catalan numbers $C_n=\binom{2n}n/(n+1)$ has
been discussed in Exercise 1.4. From Lemma 9.1 we have $P_2(x)P_4(x)=P_2(x)(P_2(x)+4Q(x))$ for some polynomial $Q(x)\in\mathbb Z[x]$ of degree less than $d$.
Thus,
\[
\sqrt{P_2(x)P_4(x)}=P_2(x)\sqrt{1+4Q(x)/P_2(x)},
\]
and the result follows from application of (9.3) with $X=Q(z)/P_2(z)\in z^{-1}\mathbb Z[[z^{-1}]]$.

The rationality of $f(z)$ would mean that
\[
P_2(x)P_4(x)=\prod_{j=1}^d(x-\alpha_j^2)(x-\alpha_j^4)
\]
is a square in $\mathbb Z[x]$. Since $P_2(x)$ is irreducible by the hypothesis, the numbers
$\alpha_1^2,\dots,\alpha_d^2$ are pairwise distinct, hence each of them pairs up with some
zero of $P_4(x)$: $\alpha_j^2=\alpha_{\sigma(j)}^4$ for each $j=1,\dots,d$. The mapping $\sigma$ is clearly a
permutation of the indices of $\alpha_1,\dots,\alpha_d$. Iterating the identity,
\[
\alpha_j^2=(\alpha_{\sigma(j)}^2)^2=\alpha_{\sigma^2(j)}^8=\dots=\alpha_{\sigma^k(j)}^{2^{k+1}} \quad\text{for}\ k=1,2,\dots,
\]
and using the fact that $\sigma^k$ is the identity for some $k$, we conclude that
$\alpha_j^2=\alpha_j^{2^{k+1}}$, implying that each $\alpha_j$ is a zero of the polynomial $x^{2^{k+1}-2}-1$.
In particular, $\alpha_1,\dots,\alpha_d$ are roots of unity, thus our polynomial $P(x)$ is
cyclotomic.
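The integrality in (9.3) comes from the classical Catalan generating function, which gives $\sqrt{1+4X}=1+2\sum_{n\ge1}(-1)^{n-1}C_{n-1}X^n$. A short machine check of this expansion, written as a sketch with illustrative function names, squares the truncated series and compares the result with $1+4X$:

```python
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def sqrt_series(terms):
    """Coefficients of sqrt(1 + 4X) up to X^(terms-1), as given by the Catalan expansion."""
    return [1] + [2 * (-1) ** (n - 1) * catalan(n - 1) for n in range(1, terms)]

def truncated_square(c):
    """Coefficients of c(X)^2 modulo X^len(c)."""
    return [sum(c[i] * c[k - i] for i in range(k + 1)) for k in range(len(c))]

coeffs = sqrt_series(12)
print(coeffs[:6])
print(truncated_square(coeffs))
```

All coefficients beyond the constant term are even integers, which is what places $f(z)$ in $1+z^{-1}\mathbb Z[[z^{-1}]]$ after the substitution $X=Q(z)/P_2(z)$.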


9.2 Hankel determinants and transfinite diameter


Recall the Vandermonde determinant evaluation
\[
V(z_1,\dots,z_n)=\det_{1\le j,\ell\le n}\bigl(z_j^{\ell-1}\bigr)=\prod_{1\le j<\ell\le n}(z_\ell-z_j).
\]

Lemma 9.2 (Fekete [29]). Let $K$ be a compact in $\mathbb C$ containing infinitely
many points. Denote by $M_n$ the maximum of the quantity $|V(z_1,\dots,z_n)|$
as $z_1,\dots,z_n$ run through the set $K$. Then the limit
\[
t(K)=\lim_{n\to\infty}M_n^{2/(n(n-1))}
\]
exists.


The limit $t(K)$ is called the transfinite diameter of $K$. Observe that
the definition implies that $t(K)\le t(K')$ whenever we have $K\subset K'$ for two
compacts $K$ and $K'$ in $\mathbb C$.
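Proof. We borrow the argument from [48, Problem 1.10]. Given $n>1$,
assume that the maximum $M_n$ of $|V(z_1,\dots,z_n)|$ is attained at $\zeta_1,\dots,\zeta_n\in K$. Because
\[
\frac{V(\zeta_1,\dots,\zeta_n)}{V(\zeta_1,\dots,\zeta_{n-1})}=(\zeta_n-\zeta_1)\cdots(\zeta_n-\zeta_{n-1})
\]
and $|V(\zeta_1,\dots,\zeta_{n-1})|\le M_{n-1}$, we conclude that
\[
\frac{M_n}{M_{n-1}}\le|\zeta_n-\zeta_1|\cdots|\zeta_n-\zeta_{n-1}|.
\]
With the same argument used for $\zeta_n$ replaced with any other $\zeta_j$ we find out
that
\[
\Bigl(\frac{M_n}{M_{n-1}}\Bigr)^n\le\prod_{j=1}^n\prod_{\substack{\ell=1\\ \ell\ne j}}^n|\zeta_\ell-\zeta_j|=M_n^2,
\]
equivalently, $M_n^{n-2}\le M_{n-1}^n$, implying that $M_n^{2/(n(n-1))}$ is monotone
decreasing.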
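A case where $M_n$ is known in closed form makes the monotone convergence tangible. For the closed unit disk, Hadamard's inequality bounds $|V|$ by $n^{n/2}$ and the $n$th roots of unity attain this value (a classical fact assumed here, not taken from the text), so $M_n^{2/(n(n-1))}=n^{1/(n-1)}$ decreases to the transfinite diameter 1. The sketch below evaluates the Vandermonde product numerically; the function names are illustrative.

```python
import cmath
import math

def vandermonde_abs(points):
    """|V(z_1, ..., z_n)| as the product of pairwise distances."""
    prod = 1.0
    for l in range(len(points)):
        for j in range(l):
            prod *= abs(points[l] - points[j])
    return prod

def fekete_value(n):
    """M_n^(2/(n(n-1))) for the unit disk, evaluated at the n-th roots of unity."""
    roots = [cmath.exp(2j * math.pi * k / n) for k in range(n)]
    return vandermonde_abs(roots) ** (2 / (n * (n - 1)))

values = [fekete_value(n) for n in (2, 4, 8, 16, 32)]
print(values)
```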


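Proof of Proposition 9.2. Choose $\gamma$ to be a piecewise smooth closed contour
in $\mathbb C\setminus\widehat K$, which is positively oriented with respect to $\infty$. The function $f(z)$
is analytic in the exterior of $\gamma$, so that the Cauchy integral formula applies
and we obtain
\[
a_k=\frac1{2\pi i}\oint_\gamma z^kf(z)\,dz
\]
for the coefficients of the expansion of $f(z)$ at infinity. Therefore,
\begin{align*}
A_n&=\det_{1\le j,\ell\le n}(a_{j+\ell-2})
=\frac1{(2\pi i)^n}\int\cdots\int_{\gamma^n}\det_{1\le j,\ell\le n}\bigl(z_j^{j+\ell-2}\bigr)\prod_{j=1}^nf(z_j)\,dz_j
\\
&=\frac1{(2\pi i)^n}\int\cdots\int_{\gamma^n}\prod_{j=1}^nz_j^{j-1}\cdot\det_{1\le j,\ell\le n}\bigl(z_j^{\ell-1}\bigr)\prod_{j=1}^nf(z_j)\,dz_j
\\
&=\frac1{(2\pi i)^n}\int\cdots\int_{\gamma^n}\prod_{j=1}^nz_j^{j-1}\cdot V(z_1,\dots,z_n)\prod_{j=1}^nf(z_j)\,dz_j.
\end{align*}
Now using
\[
\sum_{\sigma\in\mathfrak S_n}\operatorname{sgn}(\sigma)\prod_{j=1}^nz_{\sigma(j)}^{j-1}=\det_{1\le j,\ell\le n}\bigl(z_j^{\ell-1}\bigr)=V(z_1,\dots,z_n)
\]
and
\[
V(z_{\sigma(1)},\dots,z_{\sigma(n)})=\operatorname{sgn}(\sigma)\,V(z_1,\dots,z_n) \quad\text{for}\ \sigma\in\mathfrak S_n,
\]
and averaging the resulting $n$-tuple integral for $A_n$ over all substitutions
of the symmetric group $\mathfrak S_n$ we deduce that
\[
A_n=\frac1{n!\,(2\pi i)^n}\int\cdots\int_{\gamma^n}V(z_1,\dots,z_n)^2\prod_{j=1}^nf(z_j)\,dz_j,
\]
so that
\[
|A_n|\le\frac{(M(\gamma)L(\gamma))^n}{n!}\cdot\max_{z_1,\dots,z_n\in\gamma}|V(z_1,\dots,z_n)|^2,
\]
where $M(\gamma)$ is the maximum of $|f(z)|$ on $\gamma$ and $L(\gamma)$ is the length of $\gamma$.
Since this is valid for any contour $\gamma$ enclosing the compact $\widehat K$, we can record
the resulting inequality in the form
\[
|A_n|\le\frac{C^n}{n!}\cdot\max_{z_1,\dots,z_n\in K}|V(z_1,\dots,z_n)|^2
\]
for some positive constant $C$ independent of $n$. Raising both sides to the
power $1/n^2$, taking the limit superior as $n\to\infty$ and applying Lemma 9.2,
the statement of Proposition 9.2 follows.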


9.3 Kronecker's rationality criterion


Proof of Proposition 9.3. First notice that a power series $f(x)$ represents
a rational function if and only if the sequence of its coefficients satisfies a
recurrence relation
\[
c_0a_n+c_1a_{n+1}+\dots+c_ma_{n+m}=0 \quad\text{for all}\ n\ge n_0
\]
with constant coefficients $c_0,c_1,\dots,c_m$, not all zero. If such a relation is available and $n>n_0+m$ is arbitrary, then the columns starting with
$a_{n_0},a_{n_0+1},\dots,a_{n_0+m}$ in the determinant $A_n$ are linearly dependent, hence
$A_n=0$ for all such $n$.

We are left to show that $A_n=0$ for all $n\ge n_1$ implies a recurrence
relation for $a_n$ with constant coefficients. Choose $m$ to be such that $A_m\ne0$
while $A_n=0$ for all $n>m$. The former condition implies that the first
$m$ columns of the matrix for $A_{m+1}$ are linearly independent. On the other
hand, $A_{m+1}=0$ means that the last column of the determinant $A_{m+1}$ is a
linear combination of all previous ones:
\[
c_0a_n+c_1a_{n+1}+\dots+c_{m-1}a_{n+m-1}+a_{n+m}=0 \quad\text{for}\ n=0,1,\dots,m.
\]
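We will show that the equality holds true for all $n$, in other words, that
$b_n=0$ for all $n\ge0$ where
\[
b_n=c_0a_n+c_1a_{n+1}+\dots+c_{m-1}a_{n+m-1}+a_{n+m}.
\]
For $n>m$, assume that $b_0=b_1=\dots=b_{n-1}=0$ is already shown, write
\[
A_{n+1}=\det\left(\begin{array}{ccc|ccc}
a_0&\cdots&a_{m-1}&a_m&\cdots&a_n\\
\vdots&&\vdots&\vdots&&\vdots\\
a_{m-1}&\cdots&a_{2m-2}&a_{2m-1}&\cdots&a_{n+m-1}\\
\hline
a_m&\cdots&a_{2m-1}&a_{2m}&\cdots&a_{n+m}\\
\vdots&&\vdots&\vdots&&\vdots\\
a_n&\cdots&a_{n+m-1}&a_{n+m}&\cdots&a_{2n}
\end{array}\right)
\]
and add to each column in the right part the linear combination of $m$ preceding columns with the corresponding coefficients $c_0,c_1,\dots,c_{m-1}$. Then
\[
A_{n+1}=\det\left(\begin{array}{ccc|ccc}
a_0&\cdots&a_{m-1}&b_0&\cdots&b_{n-m}\\
\vdots&&\vdots&\vdots&&\vdots\\
a_{m-1}&\cdots&a_{2m-2}&b_{m-1}&\cdots&b_{n-1}\\
\hline
a_m&\cdots&a_{2m-1}&b_m&\cdots&b_n\\
\vdots&&\vdots&\vdots&&\vdots\\
a_n&\cdots&a_{n+m-1}&b_n&\cdots&b_{2n-m}
\end{array}\right)
=(-1)^{(n-m)(n-m+1)/2}A_m\cdot b_n^{n-m+1},
\]
because all the entries $b_0,\dots,b_{n-1}$ above the anti-diagonal $b_n,\dots,b_n$ of the right part vanish. Using
now $A_{n+1}=0$ and $A_m\ne0$, we conclude that $b_n=0$ as required.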
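Kronecker's criterion is pleasant to watch on two familiar sequences. In the sketch below (helper names are illustrative) the Fibonacci numbers, whose generating function $x/(1-x-x^2)$ is rational, have vanishing Hankel determinants from $n=3$ on, while the Catalan numbers, whose generating function $(1-\sqrt{1-4x})/(2x)$ is irrational, have all Hankel determinants equal to 1.

```python
from fractions import Fraction
from math import comb

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def hankel(seq, n):
    """A_n = det(a_{j+l}) for 0 <= j, l < n."""
    return int(det([[seq[j + l] for l in range(n)] for j in range(n)]))

fib = [0, 1]
while len(fib) < 20:
    fib.append(fib[-1] + fib[-2])
cat = [comb(2 * k, k) // (k + 1) for k in range(20)]

print([hankel(fib, n) for n in range(1, 8)])
print([hankel(cat, n) for n in range(1, 8)])
```

In the notation of the proof the Fibonacci case has $m=2$: $A_2=-1\ne0$ and the recurrence $a_{n+2}-a_{n+1}-a_n=0$ is the relation $b_n=0$.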


The next result is a variation of the rationality criterion from Proposition 9.3 also established by Kronecker.


Exercise 9.3. With a formal power series $f(x)=\sum_{n=0}^\infty a_nx^n\in\mathbb C[[x]]$ associate general $m\times m$ Hankel determinants
\[
H_{n,m}=\det_{0\le j,\ell<m}(a_{n+j+\ell}).
\]
(a) Show that $f(x)$ is a rational function if and only if $H_{n,m}=0$ for some
$m$ and all $n\ge n_1$.
(b) For $n,m=1,2,\dots$, prove the identity
\[
H_{n-1,m}H_{n+1,m}-H_{n,m}^2=H_{n-1,m+1}H_{n+1,m-1}.
\]
9.4 Transfinite diameter of a hedgehog


For a compact $K\subset\mathbb C$, its $n$th Chebyshev polynomial $T_n(z)$ is a degree $n$
monic polynomial which minimises the sup-norm
\[
\|f\|_K=\sup_{z\in K}|f(z)|
\]
over all degree $n$ monic polynomials. A simple argument (see [24, p. 208])
shows that the Chebyshev polynomial $T_n(z)$ is unique. As in Lemma 9.2,
denote by $M_n$ the maximum of $|V(z_1,\dots,z_n)|$ over all $z_1,\dots,z_n$ in $K$
and assume that it is attained at $\zeta_1,\dots,\zeta_n\in K$. Since
\[
V(\zeta_1,\dots,\zeta_n)=\det\begin{pmatrix}
1&\zeta_1&\cdots&\zeta_1^{n-2}&\zeta_1^{n-1}\\
1&\zeta_2&\cdots&\zeta_2^{n-2}&\zeta_2^{n-1}\\
\vdots&\vdots&&\vdots&\vdots\\
1&\zeta_n&\cdots&\zeta_n^{n-2}&\zeta_n^{n-1}
\end{pmatrix}
=\det\begin{pmatrix}
1&\zeta_1&\cdots&\zeta_1^{n-2}&T_{n-1}(\zeta_1)\\
1&\zeta_2&\cdots&\zeta_2^{n-2}&T_{n-1}(\zeta_2)\\
\vdots&\vdots&&\vdots&\vdots\\
1&\zeta_n&\cdots&\zeta_n^{n-2}&T_{n-1}(\zeta_n)
\end{pmatrix},
\]
expanding along the last column leads to
\[
M_n\le|T_{n-1}(\zeta_1)|\cdot|V(\zeta_2,\dots,\zeta_n)|+\dots+|T_{n-1}(\zeta_n)|\cdot|V(\zeta_1,\dots,\zeta_{n-1})|
\le n\|T_{n-1}\|_K\cdot M_{n-1}.
\]
This means that
\[
M_n\le n!\prod_{j=1}^{n-1}\|T_j\|_K,
\]
so that if $\limsup_{n\to\infty}\|T_n\|_K^{1/n}=t^*(K)$ (essentially the Chebyshev constant
of the compact $K$), then
\[
M_n\le n!\cdot C^{n-1}t^*(K)^{1+2+\dots+(n-1)}=n!\cdot C^{n-1}t^*(K)^{n(n-1)/2}
\]
for some $C=C(K)$ independent of $n$, implying $t(K)\le t^*(K)$.
Furthermore, for the interval $K=[a,b]\subset\mathbb R$, it is known that
\[
\|T_n\|_{[a,b]}\le2\Bigl(\frac{b-a}4\Bigr)^n \quad\text{for}\ n=1,2,\dots
\]
(see, for example, [48, Problem 15.9]).
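On the interval $[-1,1]$ the monic Chebyshev polynomial is the classical $2^{1-n}\cos(n\arccos x)$, whose sup-norm $2^{1-n}$ equals $2\bigl((b-a)/4\bigr)^n$ for $b-a=2$. The sketch below evaluates it through the three-term recurrence and estimates the norm on a fine grid; the grid size and the function names are arbitrary choices made for the illustration.

```python
def monic_chebyshev(n, x):
    """2^(1-n) T_n(x) on [-1, 1] via T_{k+1} = 2x T_k - T_{k-1}, for n >= 1."""
    t_prev, t_curr = 1.0, x
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr / 2 ** (n - 1)

def sup_norm(n, samples=20001):
    """Grid estimate of the sup-norm on [-1, 1]; the endpoints are included."""
    return max(abs(monic_chebyshev(n, -1 + 2 * i / (samples - 1)))
               for i in range(samples))

a, b = -1.0, 1.0
for n in (1, 2, 5, 10):
    print(n, sup_norm(n), 2 * ((b - a) / 4) ** n)
```

Taking $n$th roots shows the norms decay like $\bigl((b-a)/4\bigr)^n$, which is the value of the Chebyshev constant of an interval used for the hedgehog computation.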
Finally, for a hedgehog $K=K(\beta_1,\dots,\beta_d)$ with $\beta_j=\beta e^{2\pi ij/d}$, where
$\beta>0$, we have
\[
\|P\|_K=\sup_{z\in K}|P(z)|=\sup_{z\in K}|P(\zeta^\ell z)| \quad\text{for}\ \ell\in\mathbb Z,
\]
where $\zeta=e^{2\pi i/d}$, implying
\[
\|P\|_K^d\ge\sup_{z\in K}\,\biggl|\prod_{\ell=0}^{d-1}P(\zeta^\ell z)\biggr|
=\sup_{z^d\in[0,\beta^d]}|\widetilde P(z^d)|=\sup_{x\in[0,\beta^d]}|\widetilde P(x)|,
\]
where the polynomial $\widetilde P$ is determined by $\prod_{\ell=0}^{d-1}P(\zeta^\ell z)=\widetilde P(z^d)$.
Given $n=1,2,\dots$, the latter supremum is minimised by the monic polynomial $T_n(z)$ of degree $n$ provided that $\prod_{\ell=0}^{d-1}T_n(\zeta^\ell z)=\widetilde T_n(z^d)$, where $\widetilde T_n(x)$ is
the $n$th Chebyshev polynomial on the interval $[0,\beta^d]$. Therefore,
\[
t^*\bigl(K(\beta,\beta e^{2\pi i/d},\dots,\beta e^{2\pi i(d-1)/d})\bigr)^d=t^*([0,\beta^d])=\frac{\beta^d}4,
\]
so that $t^*\bigl(K(\beta,\beta e^{2\pi i/d},\dots,\beta e^{2\pi i(d-1)/d})\bigr)=4^{-1/d}\beta$.

For setting up some evidence towards Proposition 9.4, we first enlarge
all the prickles to $[0,\beta_j']\supset[0,\beta_j]$ of equal length $|\beta_1'|=\dots=|\beta_d'|=\max_{1\le j\le d}|\beta_j|$. Since $K=K(\beta_1,\dots,\beta_d)\subset K(\beta_1',\dots,\beta_d')=K'$, we have
$t(K)\le t(K')$ by the property of transfinite diameter. Therefore, it is
sufficient to prove the statement for the case $|\beta_1|=\dots=|\beta_d|=\beta$. Geometrically, the maximal possible value for all such configurations is achieved
when the prickles are equidistributed around the origin, and in this case
we have $t(K(\beta_1,\dots,\beta_d))\le t^*(K(\beta_1,\dots,\beta_d))=4^{-1/d}\beta$ by the calculation
above.


Chapter notes 


The conjecture of Schinzel and Zassenhaus [69] was always in the shadow
of Lehmer's question [18] about the infimum of the Mahler measure of a
monic (non-cyclotomic) $P(x)=\prod_{j=1}^d(x-\alpha_j)\in\mathbb Z[x]$ (or of its zero $\alpha=\alpha_1$),
\[
M(\alpha)=M(P(x))=\prod_{j=1}^d\max\{1,|\alpha_j|\}.
\]
All known lower bounds for $\overline{|\alpha|}$ were coming from those for $M(\alpha)$, and it
was not even clear that a separate treatment of the former is possible. This
is why Dimitrov's proof exceeded all expectations.

There is one more equivalent condition (iii) that can be included in
Exercise 9.2:
\[
\exp\biggl(\sum_{m=1}^\infty\frac{a_mx^m}m\biggr)\in\mathbb Z[[x]].
\]
Its proof requires special tools (combinatorial, arithmetic or $p$-adic), which
we do not touch here; the reader is advised to consult the solution of
Exercise 5.2 in [73, Chapter 5] for this.

The toughest ingredient of the proof is Dubinin's result (Proposition 9.4), for which a simple argument is not known. Some related discussions in this direction can be found in [46].
Chapter 10


Creative microscoping


In this chapter we return to the theme started in Chapter 1, the $q$-deformation, with the motive to apply it analytically to proving congruences for integer and rational numbers. Such congruences clearly belong
to arithmetic, so that we indeed witness another use of analysis in number
theory.

We already know what q-numbers are and what q-factorials and q- 
binomials are. But we have not seen a q-version of the binomial theo- 
rem (1.2). 

To be comfortable with the material in this chapter we need
to introduce relevant notation, which is in line with that of Chapters 3
and 7. The variable $q$ will be treated either as a formal parameter or as
a complex number inside the unit disk. For $m=0,1,\dots$, define first the
$q$-shifted factorial
\[
(a;q)_m=\prod_{j=1}^m(1-aq^{j-1}),
\]
also known as the $q$-Pochhammer symbol. It is not straightforward to
observe its similarity with the Pochhammer symbol (3.15); in fact,
\[
\lim_{q\to1}\frac{(q^a;q)_m}{(1-q)^m}=\lim_{q\to1}\prod_{j=1}^m\frac{1-q^{a+j-1}}{1-q}=\prod_{j=1}^m(a+j-1)=(a)_m.
\]
When $|q|<1$, the $q$-Pochhammer symbol makes perfect sense even if
$m=\infty$ (something inaccessible to the usual Pochhammer symbol!).
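The relation between the two symbols can be watched numerically through the standard limit $(q^a;q)_m/(1-q)^m\to(a)_m$ as $q\to1$. The following sketch uses exact rational arithmetic; the function names and the sample values $a=3$, $m=4$ are illustrative choices.

```python
from fractions import Fraction

def q_pochhammer(a, q, m):
    """(a; q)_m = prod_{j=1}^{m} (1 - a q^(j-1))."""
    result = 1
    for j in range(1, m + 1):
        result *= 1 - a * q ** (j - 1)
    return result

def pochhammer(a, m):
    """(a)_m = prod_{j=1}^{m} (a + j - 1)."""
    result = 1
    for j in range(1, m + 1):
        result *= a + j - 1
    return result

a, m = 3, 4
for q in (Fraction(9, 10), Fraction(99, 100), Fraction(9999, 10000)):
    ratio = q_pochhammer(q ** a, q, m) / (1 - q) ** m
    print(float(q), float(ratio), pochhammer(a, m))
```

Each factor $(1-q^k)/(1-q)=1+q+\dots+q^{k-1}$ is a $q$-number tending to $k$, which is why the ratio approaches $3\cdot4\cdot5\cdot6=360$.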


Exercise 10.1 ($q$-binomial theorem). (a) Prove that for $n=0,1,2,\dots$,
\[
(x;q)_n=\sum_{m=0}^n\begin{bmatrix}n\\m\end{bmatrix}_q(-x)^mq^{m(m-1)/2}. \tag{10.1}
\]
(b) Verify that the limiting case of (10.1) as $q\to1$ is $(1-x)^n=\sum_{m=0}^n\binom nm(-1)^mx^m$, which is an equivalent form of the binomial theorem (1.2).
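Before attempting the exercise it may help to confirm the $q$-binomial theorem at a few rational values of $x$ and $q$. The sketch below compares the product $\prod_{j=0}^{n-1}(1-xq^j)$ with the sum $\sum_m\bigl[{n\atop m}\bigr]_q(-x)^mq^{m(m-1)/2}$, using the $q$-Pascal recurrence for the Gaussian binomials; the function names and sample values are assumptions made for the illustration.

```python
from fractions import Fraction

def q_binomial(n, m, q):
    """Gaussian binomial [n, m]_q at a numeric q via the q-Pascal recurrence."""
    if m < 0 or m > n:
        return 0
    if m == 0 or m == n:
        return 1
    return q_binomial(n - 1, m - 1, q) + q ** m * q_binomial(n - 1, m, q)

def product_side(x, q, n):
    result = 1
    for j in range(n):
        result *= 1 - x * q ** j
    return result

def sum_side(x, q, n):
    return sum(q_binomial(n, m, q) * (-x) ** m * q ** (m * (m - 1) // 2)
               for m in range(n + 1))

x, q = Fraction(2, 3), Fraction(3, 5)
print([product_side(x, q, n) == sum_side(x, q, n) for n in range(7)])
```

Setting $q=1$ in the same code reduces the Gaussian binomials to ordinary ones and the identity to $(1-x)^n$, in line with part (b).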


In Section 10.3 we review further examples of such identities. 


10.1 Supercongruences for binomial coefficients 


Wilson’s theorem (see Exercise 5.3) implies that for a prime p and a positive 
integer a we have yD = 1 (mod p), which can be stated equivalently as 


(2) =() in 


It turns out that there is a much finer version of this result, which we discuss 
below. 

The following congruence is usually attributed to Ljunggren (1952) or
to Kazandzidis (1968), though it is essentially equivalent to its particular
instance $a=2$, $b=1$ shown much earlier by Wolstenholme (1862).


Theorem 10.1. Take $a>b>0$ integers. Then for primes $p\ge5$,
\[
\binom{ap}{bp}\equiv\binom ab\pmod{p^3}.
\]

The term supercongruence was coined by Stienstra and Beukers for a congruence like the one in Theorem 10.1, which holds modulo an 'unexpectedly' high power
of $p$. At the same time the congruence has a
relatively simple (or elementary) proof modulo $p$.
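The Wolstenholme–Ljunggren–Kazandzidis congruence $\binom{ap}{bp}\equiv\binom ab\pmod{p^3}$ for primes $p\ge5$ can be sampled directly with exact integer arithmetic. The helper name `holds` below is illustrative, and the second line shows that the prime $p=3$ really has to be excluded.

```python
from math import comb

def holds(a, b, p, power=3):
    """Test binom(ap, bp) = binom(a, b) modulo p^power."""
    return (comb(a * p, b * p) - comb(a, b)) % p ** power == 0

primes = [5, 7, 11, 13]
print(all(holds(a, b, p) for p in primes for a in range(2, 7) for b in range(1, a)))
print(holds(2, 1, 3), holds(2, 1, 3, power=2))
```

For $p=3$, $a=2$, $b=1$ the difference is $\binom63-\binom21=18$, divisible by $9$ but not by $27$.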

Instead of showing the Wolstenholme–Ljunggren–Kazandzidis supercongruence we will prove its $q$-deformed version, which was settled recently by
Straub [74].


Theorem 10.2. Take $a>b>0$ integers. Then for integers $n>0$,
\[
\begin{bmatrix}an\\bn\end{bmatrix}_q\equiv\begin{bmatrix}a\\b\end{bmatrix}_{q^{n^2}}-b(a-b)\binom ab\frac{n^2-1}{24}(q^n-1)^2\pmod{\Phi_n(q)^3}, \tag{10.2}
\]
where $\Phi_n(q)$ denotes the $n$th cyclotomic polynomial.
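Straub's $q$-congruence, $\bigl[{an\atop bn}\bigr]_q\equiv\bigl[{a\atop b}\bigr]_{q^{n^2}}-b(a-b)\binom ab\frac{n^2-1}{24}(q^n-1)^2$ modulo $\Phi_n(q)^3$, can be tested for small parameters with integer polynomial arithmetic. The sketch below clears the denominator 24 and checks divisibility by $\Phi_n(q)^3$; every helper name is an assumption made for the illustration, and polynomials are coefficient lists from the constant term upwards.

```python
from math import comb

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_divmod(num, den):
    """Quotient and remainder for division by a monic integer polynomial."""
    num = num[:]
    quot = [0] * max(len(num) - len(den) + 1, 1)
    for shift in range(len(num) - len(den), -1, -1):
        coef = num[shift + len(den) - 1]
        quot[shift] = coef
        for i, d in enumerate(den):
            num[shift + i] -= coef * d
    return quot, num[:len(den) - 1]

def cyclotomic(n):
    """Phi_n(q): divide q^n - 1 by Phi_d(q) for all proper divisors d of n."""
    poly = [-1] + [0] * (n - 1) + [1]
    for d in range(1, n):
        if n % d == 0:
            poly, _ = poly_divmod(poly, cyclotomic(d))
    return poly

def q_binomial(n, m):
    """Gaussian binomial [n, m]_q as a polynomial in q (q-Pascal recurrence)."""
    row = [[1]]
    for size in range(1, n + 1):
        new = [[1]]
        for k in range(1, size):
            new.append(poly_add(row[k - 1], [0] * k + row[k]))
        new.append([1])
        row = new
    return row[m]

def substitute_power(p, e):
    """Replace q by q^e in the polynomial p."""
    out = [0] * ((len(p) - 1) * e + 1)
    for i, c in enumerate(p):
        out[i * e] = c
    return out

def check(a, b, n):
    """Is 24*([an,bn]_q - [a,b]_{q^(n^2)}) + b(a-b)C(a,b)(n^2-1)(q^n-1)^2 divisible by Phi_n^3?"""
    lhs = q_binomial(a * n, b * n)
    main = substitute_power(q_binomial(a, b), n * n)
    qn_minus_1 = [-1] + [0] * (n - 1) + [1]
    factor = b * (a - b) * comb(a, b) * (n * n - 1)
    corr = [factor * c for c in poly_mul(qn_minus_1, qn_minus_1)]
    diff = poly_add([24 * c for c in poly_add(lhs, [-c for c in main])], corr)
    phi = cyclotomic(n)
    _, rem = poly_divmod(diff, poly_mul(phi, poly_mul(phi, phi)))
    return all(c == 0 for c in rem)

cases = [(2, 1, 1), (2, 1, 2), (2, 1, 3), (3, 1, 2), (3, 2, 3), (4, 2, 2)]
print([check(a, b, n) for a, b, n in cases])
```

For $(a,b,n)=(2,1,2)$ the cleared difference equals $6(1+q)^4$, so the congruence holds modulo $\Phi_2(q)^3=(1+q)^3$ with one factor to spare.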


Modulo $\Phi_n(q)^2$ rather than $\Phi_n(q)^3$ one can write down simpler versions,
for example
\[
\begin{bmatrix}an\\bn\end{bmatrix}_q\sigma^bq^{\binom{bn}2}\equiv\binom{a-1}b+\binom{a-1}{b-1}\sigma^aq^{\binom{an}2}\pmod{\Phi_n(q)^2}, \tag{10.3}
\]
where $\sigma=(-1)^{n-1}$.

The original proof given in [74] is combinatorial; here we follow a different route. The congruence in (10.2) is in fact a $q$-congruence, so that we
have to clarify its meaning. A congruence $A_1(q)\equiv A_2(q)\pmod{P(q)}$ for rational functions $A_1(q),A_2(q)$ of the parameter $q$ and a polynomial $P(q)\in\mathbb Z[q]$
is understood as follows: the polynomial $P(q)$ is relatively prime with the
denominators of $A_1(q)$ and $A_2(q)$, and $P(q)$ divides the numerator $A(q)$ of
the difference $A_1(q)-A_2(q)$. The latter is equivalent to the condition that
for each zero $\alpha\in\mathbb C$ of $P(q)$ of multiplicity $k$, the polynomial $(q-\alpha)^k$ divides $A(q)$ in $\mathbb C[q]$; in other words, $A_1(q)-A_2(q)=O((q-\alpha)^k)$ as $q\to\alpha$.
This latter, purely analytic, interpretation underlies our argument in
establishing $q$-congruences. For example, showing the congruence (10.3) is
equivalent to verifying that


nl," — (2) = ) 7 es saat —6)(2) + 0(6?) ase 0, 
(10.4) 
when $q=\zeta(1-\varepsilon)$ and $\zeta$ is any primitive $n$th root of unity.
How does Theorem 10.2 imply Theorem 10.1? The congruence (10.2)
means that
\[
\begin{bmatrix}an\\bn\end{bmatrix}_q-\begin{bmatrix}a\\b\end{bmatrix}_{q^{n^2}}+b(a-b)\binom ab\frac{n^2-1}{24}(q^n-1)^2=\frac1{24}\,\Phi_n(q)^3B(q)
\]
for some polynomial $B(q)$ with integer coefficients. Choosing $n=p>3$ in
this equality and then letting $q\to1$ result in
\[
\binom{ap}{bp}-\binom ab=\frac1{24}\,B_0p^3 \quad\text{for some}\ B_0\in\mathbb Z,
\]
so that Theorem 10.1 follows. Also notice that (10.2) simplifies to
$\bigl[{an\atop bn}\bigr]_q\equiv\bigl[{a\atop b}\bigr]_{q^{n^2}}$ modulo $\Phi_n(q)^2$ (the additional term drops!), hence the above argument reduces the resulting congruence to
\[
\binom{ap}{bp}\equiv\binom ab\pmod{p^2} \quad\text{for all primes}\ p,
\]
the result first shown by Babbage (1819) for $a=2$, $b=1$ and preceding
Wolstenholme's theorem.
Lemma 10.1. Let $\zeta$ be a primitive $n$th root of unity. Then, as $q=\zeta(1-\varepsilon)\to\zeta$ radially,


in} 2802) — (5) — (87g anal 


= —b(a (3) cere) a =a O(e?). (10.5) 


Proof. It follows from the qg-binomial theorem (10.1) with n replaced by an 
that 


1 n : an a a a 
a Io: = _ ym qm(m—1)/2 _ __,\bn ,bn(bn—1)/2 
= > ((72;9)an = > at «)"q > ral «yg 


(10.6) 
When gq = ¢(1—«), we get d/de = —¢ (d/dg). If 


F(a) = (2:d)an_ and g(a) = Fog fla) =— > TE, 


then f(g)leo = (1—@”)* and 


df 


dq 1% 


In particular, 


df) ma we l6ta 

al ee es ) dj Coa 
and 

d?f 7 sts an-1 tc ox 2 

ze [7-2 ( res) 


ss Sic er) 
2.\a-Cap* 1c) ) 
Further observe the following summation formulae: 


n 
and
nm 


5 av Cha 
n l-a1—(¢Fe 


j=l 


gh 


= Se for k £0 (mod n). 


ain 


Implementing this information into (10.6) we obtain 


. an x 
(aggre = (1 _ a”) l+te 7 e 
x A sees l-« a 


62 en an—1 2 62 nx an—1 
me C\ qe Qe 
poeo(S) +o YL 44 
l=1 6; ,0o=1 
£,=£2 (mod n) 


e2 nan x an—-1 e2 a an—1 
?— — — le-1 O(e?). 
(ce 1-2” > Dine Ly Co Oe) 
Finally, compare the coefficients of powers of x” on both sides of the relation 
obtained; this way we arrive at the asymptotics in (10.5). 


To prove Theorem 10.2 we need to produce a ‘matching’ asymptotic for 


el 


This happens to be easier than what we have done in Lemma 10.1, because
$q^n=(1-\varepsilon)^n$ does not depend on the choice of primitive $n$th root of unity
$\zeta$ when $q=\zeta(1-\varepsilon)$.


Lemma 10.2. As $q=\zeta(1-\varepsilon)\to\zeta$ radially,


A ama) (5 )- (zp otal 


= b(a b) (5) (3(an — 1) Sat 1)n )n e2 O(e?). 
Proof. From (10.1) we conclude that 
ord saJo= So [F) ok(—ayhral. 
b=0 a 
Then, for q = ¢(1 — ¢€), we write y = 0,2” to obtain 
(2"q’?);q™ Ja = (y(t 2)@); ©)" )a = T] (1- yt - )*@)) 
&=0 
a-1 én? +(3) 
hehe ee fn? + (3)\ (_ yi 
eye 


It remains to compare the coefficients of x” on both sides. 
Proof of Theorem 10.2. Note that $\varepsilon=\frac1n(1-q^n)+O(\varepsilon^2)$ as $q=\zeta(1-\varepsilon)\to\zeta$
radially, where $\zeta$ is a primitive $n$th root of unity. Combining the expansions
in Lemmas 10.1, 10.2 we find out that


ei) sey) oa) 
| an A qr? ond 


a\ (an? n? — (a n 
ae (9): 2) 5] ah UR 2) OG) 
= —0(a (3) te a e? + O(e?) 


a\ n?—1 
=—b b @ — 1)? + O(e%). 
(- (5) - 0? 400%) 
This means that the difference of both sides is divisible by (q — ¢)° for 
any nth primitive root of unity ¢, hence by ®,,(q)?. The latter property is 
equivalent to the congruence (10.2). 
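
The divisibility statement at the end of the proof can be tested on small cases with exact integer polynomial arithmetic. The sketch below assumes that the congruence (10.2) has the shape suggested by the last display, namely that 24 times the difference of the two q-binomial coefficients plus $b(a-b)\binom{a}{b}(n^2-1)(q^n-1)^2$ is divisible by $\Phi_n(q)^3$; it only treats prime n, where $\Phi_n(q) = 1+q+\dots+q^{n-1}$. All function names are illustrative.

```python
from math import comb

def poly_add(f, g):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0) for i in range(n)]

def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] += x * y
    return out

def poly_rem(f, g):
    """Remainder of f modulo a monic integer polynomial g."""
    f = list(f)
    while len(f) >= len(g):
        lead = f[-1]
        shift = len(f) - len(g)
        for i, y in enumerate(g):
            f[shift + i] -= lead * y
        f.pop()  # the leading coefficient is now zero
    return f

def q_binomial(n, k):
    """Coefficients of the q-binomial [n, k]_q via [n,k] = [n-1,k-1] + q^k [n-1,k]."""
    row = [[1]]
    for m in range(1, n + 1):
        new_row = []
        for j in range(m + 1):
            left = row[j - 1] if j >= 1 else [0]
            right = [0] * j + row[j] if j < m else [0]
            new_row.append(poly_add(left, right))
        row = new_row
    return row[k]

def straub_defect(a, b, n):
    """24*([an,bn]_q - [a,b]_{q^(n^2)}) + b(a-b)C(a,b)(n^2-1)(q^n-1)^2 (assumed form)."""
    big = q_binomial(a * n, b * n)
    small = q_binomial(a, b)
    stretched = [0] * ((len(small) - 1) * n * n + 1)
    for i, c in enumerate(small):
        stretched[i * n * n] = c  # substitute q -> q^(n^2)
    diff = poly_add(big, [-c for c in stretched])
    qn_minus_1 = [-1] + [0] * (n - 1) + [1]
    weight = b * (a - b) * comb(a, b) * (n * n - 1)
    correction = [weight * c for c in poly_mul(qn_minus_1, qn_minus_1)]
    return poly_add([24 * c for c in diff], correction)

def divisible_by_phi_cubed(a, b, p):
    """Check divisibility by Phi_p(q)^3 for a prime p."""
    phi = [1] * p
    modulus = poly_mul(poly_mul(phi, phi), phi)
    return all(c == 0 for c in poly_rem(straub_defect(a, b, p), modulus))
```

For a = 2, b = 1, n = 2 the defect is $6(1+q)^4$, visibly a multiple of $\Phi_2(q)^3 = (1+q)^3$.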


10.2 Ramanujan’s formulae for 1/π


Srinivasa Ramanujan (1887-1920) was an Indian mathematician whose 
mathematical contributions had a lasting impact on the development of 
number theory and special functions. Many notions and theorems origi- 
nated from his papers, letters and notebooks; the account of his work and 
its implications can be found in [2, 10, 11,60]. 

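In his development of the theory of elliptic functions, Ramanujan came
up [65] with computationally efficient representations of 1/π. Examples are
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n^3}{n!^3}\,(1+6n)\,\frac{1}{4^n} = \frac{4}{\pi}, \tag{10.7}
\]
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n(\tfrac13)_n(\tfrac23)_n}{n!^3}\,(4+33n)\,\Bigl(\frac{4}{125}\Bigr)^n = \frac{15\sqrt3}{2\pi}, \tag{10.8}
\]
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n(\tfrac16)_n(\tfrac56)_n}{n!^3}\,(8+133n)\,\Bigl(\frac{4}{85}\Bigr)^{3n} = \frac{85\sqrt{85}}{18\sqrt3\,\pi}, \tag{10.9}
\]
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n(\tfrac14)_n(\tfrac34)_n}{n!^3}\,(1123+21460n)\,\frac{(-1)^n}{882^{2n}} = \frac{3528}{\pi}, \tag{10.10}
\]
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n(\tfrac14)_n(\tfrac34)_n}{n!^3}\,(1103+26390n)\,\frac{1}{99^{4n}} = \frac{9801}{2\sqrt2\,\pi}, \tag{10.11}
\]
where $(s)_n = \prod_{j=1}^n (s+j-1)$ is the Pochhammer symbol (3.15); these are
equations (28), (32), (34), (39) and (44) on the list in [65]. Ramanujan
did not hide his interest in computing π; his comment in [65] about identity
(10.11) says “The last series (44) is extremely rapidly convergent.” In
total, Ramanujan gave seventeen such equalities.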
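
The difference in convergence speed is easy to observe numerically. The following sketch sums the series of the shape $\sum_n \frac{(1/2)_n(s)_n(1-s)_n}{n!^3}(a+bn)z^n$ with exact rational partial sums and compares them with the expected multiples of 1/π; it assumes the normalisations $4/\pi$ for (10.7) and $9801/(2\sqrt2\,\pi)$ for (10.11), and the helper names are illustrative only.

```python
from fractions import Fraction
from math import pi, sqrt

def poch(s, n):
    """Pochhammer symbol (s)_n = s(s+1)...(s+n-1) as an exact Fraction."""
    out = Fraction(1)
    for j in range(n):
        out *= s + j
    return out

def ramanujan_sum(s, a, b, z, terms):
    """Partial sum of sum_n (1/2)_n (s)_n (1-s)_n / n!^3 * (a + b*n) * z^n."""
    half = Fraction(1, 2)
    total = Fraction(0)
    factorial = Fraction(1)
    for n in range(terms):
        if n > 0:
            factorial *= n
        coeff = poch(half, n) * poch(s, n) * poch(1 - s, n) / factorial**3
        total += coeff * (a + b * n) * z**n
    return float(total)

# (10.7): roughly two binary digits per term
slow = ramanujan_sum(Fraction(1, 2), 1, 6, Fraction(1, 4), 40)
# (10.11): roughly eight decimal digits per term, so four terms already saturate a float
fast = ramanujan_sum(Fraction(1, 4), 1103, 26390, Fraction(1, 99**4), 4)
```

Here `slow` should agree with `4 / pi` and `fast` with `9801 / (2 * sqrt(2) * pi)` up to floating-point accuracy.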

The identities do not look hard. In spite of this, their first proofs
were only obtained in the 1980s by the Borweins and independently by the
Chudnovskys. A historical account of contemporary techniques for proving
Ramanujan’s (and Ramanujan-type) formulae for 1/π can be found
in [8,88]. The dominating method which, for example, works for all the
formulae in (10.7)–(10.11) is based on modular-function parametrisations of
the underlying series. This modular technique cannot be counted as
elementary, but it leads to many further examples (though not necessarily
computationally useful) like the formula
\[
\sum_{n=0}^\infty u_n\,(20n+10-3\sqrt5)\,\Bigl(\frac{\sqrt5-1}{2}\Bigr)^{12n} = \frac{20\sqrt3+9\sqrt{15}}{6\pi} \tag{10.12}
\]
of Ramanujan type, involving the Apéry numbers
\[
u_n = \sum_{k=0}^n \binom{n}{k}^2\binom{n+k}{k}^2, \quad n = 0, 1, 2, \dots, \tag{10.13}
\]
from Section 7.1; this identity was discovered by T. Sato in 2002.


One formula, which is not on Ramanujan’s list in [65] but clearly belongs
to it, is
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n^3}{n!^3}\,(1+4n)\,(-1)^n = \frac{2}{\pi}. \tag{10.14}
\]
In fact, this identity was proven by Bauer (1859) long before Ramanujan
was born, using a quite elementary argument. The convergence in (10.14)
is poor and comparable with Leibniz’s formula (1.11) (though the latter is
for π itself). Nevertheless the shape of the formula is very much the same
as in (10.7)–(10.11), with the sums on the left-hand sides linked to
particular instances m = 3 of the (generalized) hypergeometric series


\[
{}_mF_{m-1}\biggl(\begin{matrix} a_1,\,a_2,\,\dots,\,a_m\\ b_2,\,\dots,\,b_m\end{matrix}\biggm| z\biggr)
= \sum_{n=0}^\infty \frac{(a_1)_n(a_2)_n\cdots(a_m)_n}{(b_2)_n\cdots(b_m)_n}\,\frac{z^n}{n!}. \tag{10.15}
\]
Namely, the identities listed all involve linear combinations of a ${}_3F_2$ series and
its derivative at a (rational) point, with $a_1 = \frac12$, $a_2 = 1-a_3 \in \{\frac12, \frac13, \frac14, \frac16\}$
and $b_2 = b_3 = 1$. The series defining ${}_mF_{m-1}(z)$ converges in the unit disk
|z| < 1. It also satisfies a linear homogeneous differential equation of order
m with coefficients in C(z); this differential equation allows one to continue
the function analytically to C \ [1,+∞). Good sources for the theory of
generalized hypergeometric series are books [6,82] (see also [71]). Though
(10.12) does not belong to this hypergeometric family of Ramanujan’s
formulae, the generating series $\sum_{n=0}^\infty u_n z^n$ satisfies a third order linear
differential equation (given in an equivalent form in Exercise 7.2) which
shares many similarities with those satisfied by ${}_3F_2(z)$; this places (10.12)
on the list of Ramanujan-type formulae.

The following exercise illustrates another technique which can be used 
for proving some identities of Ramanujan type. It relies on the method of 
creative telescoping which we have already seen in action in Exercise 7.2. 


Exercise 10.2 (Zeilberger [28]). Define
\[
F(n,k) = \frac{(\tfrac12)_n^2\,(-k)_n}{n!^2\,(\tfrac32+k)_n}\,(1+4n)\,(-1)^n \cdot \frac{\Gamma(\tfrac32)\,\Gamma(1+k)}{\Gamma(\tfrac32+k)}
\]
and take
\[
G(n,k) = -\frac{(2n+1)^2}{(2n+2k+3)(4n+1)}\,F(n,k).
\]
(a) Show that for n = 0, 1, 2, ... and k = 0, 1, 2, ...,
\[
F(n,k+1) - F(n,k) = G(n,k) - G(n-1,k).
\]
(b) Use part (a) to prove that
\[
\sum_{n=0}^\infty F(n,k) = \sum_{n\in\mathbb{Z}} F(n,k)
\]
does not depend on k. Then show that this constant is 1 (computing
the sum, for example, at k = 0).
(c) Conclude that
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n^2\,(-k)_n}{n!^2\,(\tfrac32+k)_n}\,(1+4n)\,(-1)^n = \frac{\Gamma(\tfrac32+k)}{\Gamma(\tfrac32)\,\Gamma(1+k)}. \tag{10.16}
\]

Hint. (a) Divide both sides by F(n, k) to reduce verification to one of an
identity for simple rational functions in n and k.

Though equality (10.16) is only shown to be true for k = 0, 1, 2, ..., it
remains true for k ∈ C with Re k > −1; this is a consequence of Carlson’s
theorem (see, for example, [6, Section 5.3]), another classical analysis result.
Finally, notice that Bauer’s identity (10.14) is the case k = −1/2 of (10.16).
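
Before attempting Exercise 10.2 by hand, one can test the telescoping relation of part (a) and the constancy of the sum in part (b) with exact rational arithmetic. The sketch below is an illustration under stated assumptions: for integer k ≥ 0 it uses $\Gamma(\frac32)\Gamma(1+k)/\Gamma(\frac32+k) = k!/(\frac32)_k$, it sets $F(-1,k) = 0$, and it takes $G(n,k) = -\frac{(2n+1)^2}{(2n+2k+3)(4n+1)}F(n,k)$, the sign with which the relation holds for this normalisation of F.

```python
from fractions import Fraction

def poch(s, n):
    """Pochhammer symbol (s)_n as an exact Fraction."""
    out = Fraction(1)
    for j in range(n):
        out *= s + j
    return out

def F(n, k):
    """F(n,k) of Exercise 10.2 for integers n and k >= 0 (zero for n < 0)."""
    if n < 0:
        return Fraction(0)
    half, three_halves = Fraction(1, 2), Fraction(3, 2)
    n_factorial = poch(Fraction(1), n)
    # Gamma(3/2) Gamma(1+k) / Gamma(3/2+k) = k! / (3/2)_k for integer k >= 0
    gamma_ratio = poch(Fraction(1), k) / poch(three_halves, k)
    hyper = poch(half, n) ** 2 * poch(Fraction(-k), n) / (n_factorial**2 * poch(three_halves + k, n))
    return hyper * (1 + 4 * n) * (-1) ** n * gamma_ratio

def G(n, k):
    """Companion term with the sign convention assumed in the lead-in."""
    if n < 0:
        return Fraction(0)
    return -Fraction((2 * n + 1) ** 2, (2 * n + 2 * k + 3) * (4 * n + 1)) * F(n, k)

def wz_relation_holds(n, k):
    return F(n, k + 1) - F(n, k) == G(n, k) - G(n - 1, k)

def row_sum(k):
    """Sum over n of F(n,k); the terms vanish for n > k because (-k)_n = 0."""
    return sum(F(n, k) for n in range(k + 1))
```

With these definitions `row_sum(k)` equals 1 for every k ≥ 0, which is the content of part (b).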
Another elementary technique for producing new Ramanujan-type identities
from already known ones is known as the translation method [23,42].
It originates from algebraic identities of hypergeometric series and relies on
manipulations that use calculus rules. It is illustrated in the following
exercise.


Exercise 10.3. (a) Show Bailey’s cubic transformation
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n^3}{n!^3}\,x^n = (1-4x)^{-1/2}\sum_{n=0}^\infty \frac{(\tfrac12)_n(\tfrac16)_n(\tfrac56)_n}{n!^3}\,\Bigl(-\frac{27x}{(1-4x)^3}\Bigr)^n
\]
for x from a neighbourhood of the origin.
(b) Using the identity from part (a) and its x-derivative at x = −1, show
that
\[
\sum_{n=0}^\infty \frac{(\tfrac12)_n(\tfrac16)_n(\tfrac56)_n}{n!^3}\,(3+28n)\,\Bigl(\frac35\Bigr)^{3n} = \frac{5\sqrt5}{\pi}.
\]
This formula was not given by Ramanujan in [65].


Hint. (a) Verify that both sides satisfy the same linear differential equation
of order 3.

(b) Apply the operator Id + 4x (d/dx) to both sides of the identity from part (a),
then substitute x = −1 and use the known formula (10.14) for the left-hand
side.


The next example is an advanced version of Exercise 10.3. 


Exercise 10.4. Let $u_n$ be the sequence of Apéry numbers defined in (10.13).


(a) Show that for sufficiently small |x|,
\[
\sum_{n=0}^\infty u_n\,\frac{x^n(1-8x)^n}{(1+x)^{n+1}} = (1-8x)^{-3/2}\sum_{n=0}^\infty \frac{(\tfrac12)_n^3}{n!^3}\,\Bigl(-\frac{64x(1+x)^3}{(1-8x)^3}\Bigr)^n.
\]
(b) Use the transformation from part (a) at $x = (9\sqrt6-22)/4$ and (10.14)
to prove
\[
\sum_{n=0}^\infty (4-\sqrt6+8n)\,u_n\,(\sqrt3-\sqrt2)^{4n+2} = \frac{1}{\pi\sqrt2}.
\]
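
Both parts of Exercise 10.4 lend themselves to a floating-point sanity check before any proof is attempted. The sketch below assumes the transformation in part (a) relates $\sum_n u_n x^n(1-8x)^n/(1+x)^{n+1}$ to $(1-8x)^{-3/2}\sum_n \frac{(1/2)_n^3}{n!^3}\bigl(-64x(1+x)^3/(1-8x)^3\bigr)^n$, and that the target of part (b) is $1/(\pi\sqrt2)$; function names and the sample point x = 0.005 are illustrative choices.

```python
from math import comb, pi, sqrt

def apery(n):
    """Apery number u_n = sum_k C(n,k)^2 C(n+k,k)^2."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))

def lhs_transformation(x, terms=40):
    """sum_n u_n x^n (1-8x)^n / (1+x)^(n+1) for small x."""
    t = x * (1 - 8 * x) / (1 + x)
    return sum(apery(n) * t**n for n in range(terms)) / (1 + x)

def rhs_transformation(x, terms=80):
    """(1-8x)^(-3/2) * sum_n (1/2)_n^3/n!^3 * y^n with y = -64x(1+x)^3/(1-8x)^3."""
    y = -64 * x * (1 + x) ** 3 / (1 - 8 * x) ** 3
    total, coeff = 0.0, 1.0
    for n in range(terms):
        total += coeff * y**n
        coeff *= ((2 * n + 1) / (2 * n + 2)) ** 3  # ratio of consecutive (1/2)_n^3/n!^3
    return (1 - 8 * x) ** (-1.5) * total

def apery_pi_sum(terms=60):
    """Partial sum of sum_n (4 - sqrt(6) + 8n) u_n (sqrt(3) - sqrt(2))^(4n+2)."""
    base = sqrt(3) - sqrt(2)
    return sum((4 - sqrt(6) + 8 * n) * apery(n) * base ** (4 * n + 2) for n in range(terms))
```

The terms of `apery_pi_sum` decay roughly like $0.35^n$, so sixty terms are far more than double precision requires.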


In 1997 Van Hamme noticed that several formulae of Ramanujan for
infinite sums possess arithmetic finite-sum analogues. The example relevant
to our discussion in this section and corresponding to Ramanujan-type
formula (10.14) is the family of congruences
\[
\sum_{n=0}^{p-1} \frac{(\tfrac12)_n^3}{n!^3}\,(1+4n)\,(-1)^n \equiv \Bigl(\frac{-1}{p}\Bigr)p \pmod{p^3} \quad\text{for primes } p > 2, \tag{10.17}
\]
where $\bigl(\frac{\cdot}{p}\bigr)$ denotes the Legendre symbol (see Exercise 5.10). This was
subsequently proven by Mortenson (2008) and several other proofs appeared
later. It has also been realised [89] (mostly numerically!) that the pattern
continues to hold for other Ramanujan and Ramanujan-type formulae (at
least when they correspond to ${}_3F_2(z)$ series with rational z), so that we
have


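\[
\sum_{n=0}^{p-1} \frac{(\tfrac12)_n^3}{n!^3}\,(1+6n)\,\frac{1}{4^n} \equiv \Bigl(\frac{-1}{p}\Bigr)p \pmod{p^3} \quad\text{for } p > 2, \tag{10.18}
\]
and
\[
\sum_{n=0}^{p-1} \frac{(\tfrac12)_n(\tfrac13)_n(\tfrac23)_n}{n!^3}\,(4+33n)\,\frac{4^n}{125^n} \equiv 4\Bigl(\frac{-3}{p}\Bigr)p \pmod{p^3} \quad\text{for } p > 5,
\]
\[
\sum_{n=0}^{p-1} \frac{(\tfrac12)_n(\tfrac16)_n(\tfrac56)_n}{n!^3}\,(8+133n)\,\Bigl(\frac{4}{85}\Bigr)^{3n} \equiv 8\Bigl(\frac{-255}{p}\Bigr)p \pmod{p^3} \quad\text{for } p > 5,\ p \ne 17,
\]
\[
\sum_{n=0}^{p-1} \frac{(\tfrac12)_n(\tfrac14)_n(\tfrac34)_n}{n!^3}\,(1123+21460n)\,\frac{(-1)^n}{882^{2n}} \equiv 1123\Bigl(\frac{-1}{p}\Bigr)p \pmod{p^3} \quad\text{for } p > 3,\ p \ne 7,
\]
\[
\sum_{n=0}^{p-1} \frac{(\tfrac12)_n(\tfrac14)_n(\tfrac34)_n}{n!^3}\,(1103+26390n)\,\frac{1}{99^{4n}} \equiv 1103\Bigl(\frac{-2}{p}\Bigr)p \pmod{p^3} \quad\text{for } p > 3,\ p \ne 11,
\]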


as p-counterparts of (10.7)—(10.11). At the moment the general congruences 
from this list are only proven for the family (10.18). 
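
The two proven families are straightforward to test for small primes with exact rationals. The sketch below treats the supercongruences for Bauer's series and for the series with $(1+6n)/4^n$, both taken modulo $p^3$ with right-hand side $\bigl(\frac{-1}{p}\bigr)p$ as assumed here; a rational number is regarded as divisible by $p^3$ when its denominator is coprime to p and its numerator is a multiple of $p^3$. Function names are illustrative.

```python
from fractions import Fraction
from math import gcd

def legendre_minus_one(p):
    """Legendre symbol (-1/p) for an odd prime p."""
    return 1 if p % 4 == 1 else -1

def truncated_sum(p, slope, z):
    """sum_{n=0}^{p-1} (1/2)_n^3/n!^3 * (1 + slope*n) * z^n as an exact Fraction."""
    total, coeff, power = Fraction(0), Fraction(1), Fraction(1)
    for n in range(p):
        total += coeff * (1 + slope * n) * power
        coeff *= Fraction(2 * n + 1, 2 * n + 2) ** 3  # next value of (1/2)_n^3/n!^3
        power *= z
    return total

def congruent(x, y, modulus):
    """True if x - y is a p-adic integer multiple of the modulus."""
    diff = Fraction(x) - Fraction(y)
    return gcd(diff.denominator, modulus) == 1 and diff.numerator % modulus == 0

def van_hamme_holds(p):
    """Bauer-type case: slope 4 and z = -1."""
    return congruent(truncated_sum(p, 4, Fraction(-1)), legendre_minus_one(p) * p, p**3)

def ramanujan_case_holds(p):
    """Case with slope 6 and z = 1/4."""
    return congruent(truncated_sum(p, 6, Fraction(1, 4)), legendre_minus_one(p) * p, p**3)
```

For p = 3 the first sum equals 435/512, and 435/512 + 3 = 1971/512 with 1971 = 27 · 73.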

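Notice that the terms in these sums are not integers but rational numbers,
however with denominators that only involve finitely many (small)
primes which we exclude from consideration. To see this we just need
to note that
\[
\frac{(\tfrac12)_n}{n!} = 2^{-2n}\binom{2n}{n}, \quad
\frac{(\tfrac13)_n(\tfrac23)_n}{n!^2} = 3^{-3n}\frac{(3n)!}{n!^3}, \quad
\frac{(\tfrac14)_n(\tfrac34)_n}{n!^2} = 2^{-6n}\frac{(4n)!}{(2n)!\,n!^2}, \quad
\frac{(\tfrac16)_n(\tfrac56)_n}{n!^2} = 2^{-4n}3^{-3n}\frac{(6n)!}{(3n)!\,(2n)!\,n!},
\]
where the factorial ratios are all integral.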
In spite of their limited capacity, there are already several methods on
the market designed for proving Ramanujan-type supercongruences. One
method, which is based on ideas used in Section 10.1 and known as creative
microscoping, does more (when it succeeds): it leads to simultaneous proofs
of a Ramanujan-type identity and the corresponding Ramanujan-type
supercongruences, thus explaining their mysterious interconnection. In the rest of
this chapter we illustrate the performance of creative microscoping on the
pair (10.14), (10.17).


10.3 q-Hypergeometry


We have already witnessed in the proof of Lemma 10.1 a use of the q- 
binomial theorem (10.1). In fact, the latter formula comes as a particular 
case of a more general result. 


Theorem 10.3 (q-binomial theorem). When |q| < 1 and |z| < 1,
\[
\sum_{n=0}^\infty \frac{(a;q)_n}{(q;q)_n}\,z^n = \frac{(az;q)_\infty}{(z;q)_\infty}. \tag{10.19}
\]

This theorem is a q-extension of the general binomial formula
\[
(1-z)^{-a} = \sum_{n=0}^\infty \frac{(a)_n}{n!}\,z^n = {}_1F_0\biggl(\begin{matrix} a\\ -\end{matrix}\biggm| z\biggr)
\]
(see (10.15)), and this extension is a fundamental identity in the theory of
q-hypergeometric functions: it is expected that every other q-hypergeometric
identity can be deduced via a finite combination of equation (10.19) (of
course, with different setup for its parameters).


Proof. We follow the creative telescoping strategy. Denote the nth term of
the sum in (10.19) by $F_n(z)$ and take
\[
G_n = \frac{1-q^n}{z-1}\,F_n(z) =
\begin{cases}
\dfrac{(a;q)_n\,z^n}{(q;q)_{n-1}\,(z-1)} & \text{if } n > 0,\\[2mm]
0 & \text{if } n = 0.
\end{cases}
\]
We claim the telescoping relation
\[
F_n(z) - \frac{1-az}{1-z}\,F_n(zq) = G_{n+1} - G_n
\]
for n = 0, 1, 2, ...; division of both sides by $F_n(z)$ reduces the equality to
a simpler one,
\[
1 - \frac{1-az}{1-z}\,X = \frac{(1-aX)z}{z-1} - \frac{1-X}{z-1}, \quad\text{where } X = q^n,
\]
whose verification is straightforward. Summing both sides of the telescoping
relation over n = 0, 1, 2, ... results in
\[
\sum_{n=0}^\infty F_n(z) - \frac{1-az}{1-z}\sum_{n=0}^\infty F_n(zq) = 0.
\]
Iterating this equality m times leads to
\[
\sum_{n=0}^\infty F_n(z) = \frac{1-az}{1-z}\sum_{n=0}^\infty F_n(zq) = \dots = \frac{(az;q)_m}{(z;q)_m}\sum_{n=0}^\infty F_n(zq^m),
\]
and (10.19) follows from taking the limit as m → ∞ in the result.
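
The two ingredients of this proof, the telescoping relation and the resulting product formula, can be checked numerically. The sketch below assumes the relation in the form $F_n(z) - \frac{1-az}{1-z}F_n(zq) = G_{n+1} - G_n$ with $G_n = \frac{1-q^n}{z-1}F_n(z)$; infinite products are truncated after 200 factors, and the sample parameters are arbitrary choices inside the region |q| < 1, |z| < 1.

```python
def qpoch(a, q, n):
    """Finite q-Pochhammer symbol (a;q)_n = prod_{j<n} (1 - a q^j)."""
    out = 1.0
    for j in range(n):
        out *= 1.0 - a * q**j
    return out

def F(n, a, q, z):
    """n-th term of the q-binomial series."""
    return qpoch(a, q, n) / qpoch(q, q, n) * z**n

def G(n, a, q, z):
    """Telescoping companion G_n = (1 - q^n)/(z - 1) * F_n(z); note G_0 = 0."""
    return (1.0 - q**n) / (z - 1.0) * F(n, a, q, z)

def telescoping_gap(n, a, q, z):
    """Absolute defect in the telescoping relation for a given n."""
    lhs = F(n, a, q, z) - (1.0 - a * z) / (1.0 - z) * F(n, a, q, q * z)
    rhs = G(n + 1, a, q, z) - G(n, a, q, z)
    return abs(lhs - rhs)

def q_binomial_sides(a, q, z, terms=200):
    """Truncated left-hand side and truncated product side of the theorem."""
    lhs = sum(F(n, a, q, z) for n in range(terms))
    rhs = qpoch(a * z, q, terms) / qpoch(z, q, terms)
    return lhs, rhs

lhs, rhs = q_binomial_sides(a=-1.5, q=0.5, z=0.3)
```

The same function with a = 1 returns the pair (1.0, 1.0), since then every term beyond n = 0 vanishes.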


To see that (10.1) is a special case of Theorem 10.3, replace the summation
index n in (10.19) by m and then take $a = q^{-n}$ and $z = q^n x$.

One particular feature that makes the creative telescoping possible in
the above proof, but also for general (q-)hypergeometric sums $\sum_{n=0}^\infty c_n$, is
a simple form of the quotient of two consecutive terms of the latter. This
brings us naturally to a definition of (q-)hypergeometric series: it is one for
which $c_{n+1}/c_n$ is a rational function of the index n (respectively, of the parameter
$q^n$). You may check that (10.15) is a hypergeometric series and that all the
q-sums in this chapter are q-hypergeometric series.

It is absolutely amazing how rich the hierarchy of q-hypergeometric
identities (summations and transformations) is. To get a good view of it one
needs to master numerous available tools; a comprehensive source of those
is the book [34] known among the specialists as the q-Bible. Below we limit
ourselves to a particular q-hypergeometric summation, which is a fine
representative of the theory and at the same time an instrument required in
our arithmetic application. (In the q-Bible it is inelegantly called the
summation formula for a non-terminating very-well-poised ${}_6\phi_5$-series; see [34,
eq. (II.20)].)


Theorem 10.4. When |q| < 1 and |aq| < |bcd|,
\[
\sum_{n=0}^\infty \frac{(1-aq^{2n})\,(a;q)_n(b;q)_n(c;q)_n(d;q)_n}{(1-a)\,(q;q)_n(aq/b;q)_n(aq/c;q)_n(aq/d;q)_n}\,\Bigl(\frac{aq}{bcd}\Bigr)^n
= \frac{(aq;q)_\infty\,(aq/(bc);q)_\infty\,(aq/(bd);q)_\infty\,(aq/(cd);q)_\infty}{(aq/b;q)_\infty\,(aq/c;q)_\infty\,(aq/d;q)_\infty\,(aq/(bcd);q)_\infty}. \tag{10.20}
\]
Proof. Let $F_n(a)$ denote the nth term of the sum in (10.20). Then
\[
F_n(a/q) - \frac{(a-1)(a-bc)(a-bd)(a-cd)}{(a-b)(a-c)(a-d)(a-bcd)}\,F_n(a) = G_{n+1} - G_n
\]
for all n = 0, 1, 2, ..., where
\[
G_n = \frac{(a^2q^n - bcdq)(1-q^n)}{(aq^{2n}-q)(a-bcd)}\,F_n(a/q);
\]
in particular $G_0 = 0$. (As before, the verification proceeds after reduction to
a rational-function identity via division of both sides by $F_n(a/q)$.) Replacing
a with aq, summing the telescoping relation over n = 0, 1, 2, ... and
using $G_n \to 0$ as n → ∞ when |aq/(bcd)| < 1 we obtain
\[
\sum_{n=0}^\infty F_n(a) = \frac{(aq-1)(aq-bc)(aq-bd)(aq-cd)}{(aq-b)(aq-c)(aq-d)(aq-bcd)}\sum_{n=0}^\infty F_n(aq)
= \frac{(1-aq)(1-aq/(bc))(1-aq/(bd))(1-aq/(cd))}{(1-aq/b)(1-aq/c)(1-aq/d)(1-aq/(bcd))}\sum_{n=0}^\infty F_n(aq).
\]
It remains to iterate the result m times and then compute the limit as
m → ∞.
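
A numerical comparison of the two sides of (10.20) is a cheap safeguard against misplaced parameters. The sketch below assumes the summand $\frac{(1-aq^{2n})(a,b,c,d;q)_n}{(1-a)(q,aq/b,aq/c,aq/d;q)_n}\bigl(\frac{aq}{bcd}\bigr)^n$ and the product side built from $aq$, $aq/(bc)$, $aq/(bd)$, $aq/(cd)$ over $aq/b$, $aq/c$, $aq/d$, $aq/(bcd)$; the sample values satisfy |q| < 1 and |aq| < |bcd| and are otherwise arbitrary.

```python
def qpoch(a, q, n):
    """Finite q-Pochhammer symbol (a;q)_n."""
    out = 1.0
    for j in range(n):
        out *= 1.0 - a * q**j
    return out

def term(n, a, b, c, d, q):
    """n-th summand of the very-well-poised series."""
    num = (1 - a * q ** (2 * n)) * qpoch(a, q, n) * qpoch(b, q, n) * qpoch(c, q, n) * qpoch(d, q, n)
    den = (1 - a) * qpoch(q, q, n) * qpoch(a * q / b, q, n) * qpoch(a * q / c, q, n) * qpoch(a * q / d, q, n)
    return num / den * (a * q / (b * c * d)) ** n

def series_side(a, b, c, d, q, terms=80):
    return sum(term(n, a, b, c, d, q) for n in range(terms))

def product_side(a, b, c, d, q, factors=200):
    """Infinite products truncated after `factors` factors."""
    num = 1.0
    for arg in (a * q, a * q / (b * c), a * q / (b * d), a * q / (c * d)):
        num *= qpoch(arg, q, factors)
    den = 1.0
    for arg in (a * q / b, a * q / c, a * q / d, a * q / (b * c * d)):
        den *= qpoch(arg, q, factors)
    return num / den

params = dict(a=0.5, b=2.0, c=3.0, d=-1.5, q=0.3)
gap = abs(series_side(**params) - product_side(**params))
```

Setting b = 1 kills all terms beyond n = 0 and makes the product side collapse to 1 as well, which is a convenient degenerate check.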


10.4 Supercongruences and q-supercongruences


Recall the notation $[m] = [m]_q = (1-q^m)/(1-q)$ for the q-numbers.


Theorem 10.5 (q-analogue of equation (10.14)). The following equality is
true:
\[
\sum_{n=0}^\infty \frac{(q;q^2)_n^3}{(q^2;q^2)_n^3}\,[1+4n]\,(-1)^n q^{n^2} = \frac{(q;q^2)_\infty\,(q^3;q^2)_\infty}{(q^2;q^2)_\infty^2}. \tag{10.21}
\]


Theorem 10.6 (q-analogue of family (10.17)). Let m be a positive odd
integer. Then
\[
\sum_{n=0}^{m-1} \frac{(q;q^2)_n^3}{(q^2;q^2)_n^3}\,[1+4n]\,(-1)^n q^{n^2} \equiv q^{(m-1)^2/4}\,[m]\,\Bigl(\frac{-1}{m}\Bigr) \pmod{[m]\,\Phi_m(q)^2}. \tag{10.22}
\]


In the last theorem, the truncated q-hypergeometric sums are considered
modulo (products of) cyclotomic polynomials. Notice that
$[m]_q = \prod_{d\mid m,\ d>1}\Phi_d(q)$ and that $[p]_q = \Phi_p(q) \to p$ as q → 1 when p is prime.
In the case of formula (10.21), we see that
\[
\lim_{q\to1} \frac{(q;q^2)_n^3}{(q^2;q^2)_n^3}\,[1+4n]\,(-1)^n q^{n^2} = \frac{(\tfrac12)_n^3}{n!^3}\,(1+4n)\,(-1)^n
\]
and
\[
\lim_{q\to1} \frac{(1-q^2)^{1/2}\,(q^2;q^2)_\infty}{(q;q^2)_\infty} = \Gamma\Bigl(\frac12\Bigr) = \sqrt\pi,
\]
hence in the limit as q → 1 we obtain (10.14). At the same time, taking
the limit as q → 1 in (10.22) for m = p prime leads to the Ramanujan-type
supercongruences (10.17).

Our proof of Theorem 10.6 combines two principles. One corresponds
to achieving the congruences in (10.22) modulo [m] only, and for this we
deal with the q-hypergeometric sum (10.21) at a ‘q-microscopic’ level, that
is, at roots of unity (and this cannot be transformed into a derivation of
(10.17) directly from (10.14)). Another ‘creative’ principle is about getting
more parameters involved in the q-story.


Theorem 10.7. Let m be a positive odd integer. Then, for any indeterminates
a and q, we have modulo $[m](1-aq^m)(a-q^m)$,
\[
\sum_{n=0}^{m-1} \frac{(q;q^2)_n\,(aq;q^2)_n\,(q/a;q^2)_n}{(q^2;q^2)_n\,(aq^2;q^2)_n\,(q^2/a;q^2)_n}\,[1+4n]\,(-1)^n q^{n^2} \equiv q^{(m-1)^2/4}\,[m]\,\Bigl(\frac{-1}{m}\Bigr). \tag{10.23}
\]

Proof of Theorem 10.6. The denominator of (10.23) related to a is the
factor $(aq^2;q^2)_{m-1}(q^2/a;q^2)_{m-1}$; its limit as a → 1 is relatively prime to
$\Phi_m(q)$, since m is odd. On the other hand, the limit of $(1-aq^m)(a-q^m)$
as a → 1 has the factor $\Phi_m(q)^2$. Thus, letting a → 1 in (10.23) we see that
(10.22) is true modulo $\Phi_m(q)^3$. At the same time, (10.23) considered
modulo [m] only reads
\[
\sum_{n=0}^{m-1} \frac{(q;q^2)_n\,(aq;q^2)_n\,(q/a;q^2)_n}{(q^2;q^2)_n\,(aq^2;q^2)_n\,(q^2/a;q^2)_n}\,[1+4n]\,(-1)^n q^{n^2} \equiv 0 \pmod{[m]},
\]
and we can specialise a = 1 in the result.
Thus, indeed both sides of (10.22) are congruent modulo $[m]\,\Phi_m(q)^2$.


In turn, the general set of congruences in Theorem 10.7 is deduced from 
a non-terminating version of (10.23). 


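Theorem 10.8. The following identity is true:
\[
\sum_{n=0}^\infty \frac{(q;q^2)_n\,(aq;q^2)_n\,(q/a;q^2)_n}{(q^2;q^2)_n\,(aq^2;q^2)_n\,(q^2/a;q^2)_n}\,[1+4n]\,(-1)^n q^{n^2}
= \frac{(q;q^2)_\infty\,(q^3;q^2)_\infty}{(aq^2;q^2)_\infty\,(q^2/a;q^2)_\infty}. \tag{10.24}
\]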
Proof. Take d = 1/ε in (10.20) and let ε → 0 to obtain
\[
\sum_{n=0}^\infty \frac{(1-aq^{2n})\,(a;q)_n(b;q)_n(c;q)_n}{(1-a)\,(q;q)_n(aq/b;q)_n(aq/c;q)_n}\,(-1)^n q^{n(n-1)/2}\Bigl(\frac{aq}{bc}\Bigr)^n
= \frac{(aq;q)_\infty\,(aq/(bc);q)_\infty}{(aq/b;q)_\infty\,(aq/c;q)_\infty}.
\]
In this identity replace q with $q^2$, then choose a = q, b = dq, c = q/d and
finally replace d with a.


Proof of Theorem 10.5. Take a = 1 in (10.24). 


In the remainder of this section we discuss the most non-trivial part 
of the method of creative microscoping — deduction of Theorem 10.7 from 
Theorem 10.8. 


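Lemma 10.3. Let m be a positive odd integer. Then
\[
\sum_{n=0}^{(m-1)/2} \frac{(q;q^2)_n\,(q^{1-m};q^2)_n\,(q^{1+m};q^2)_n}{(q^2;q^2)_n\,(q^{2-m};q^2)_n\,(q^{2+m};q^2)_n}\,[1+4n]\,(-1)^n q^{n^2}
= q^{(m-1)^2/4}\,[m]\,\Bigl(\frac{-1}{m}\Bigr). \tag{10.25}
\]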


Proof. We substitute $a = q^m$ into (10.24). Then the left-hand side of
(10.24) terminates (already at n = (m − 1)/2, meaning that all its terms
starting from (m + 1)/2 vanish) and equals the sum in (10.25). On the
other hand, the substitution transforms the right-hand side of (10.24) into
\[
\frac{(q;q^2)_\infty\,(q^3;q^2)_\infty}{(q^{2-m};q^2)_\infty\,(q^{2+m};q^2)_\infty}
= \frac{(q^3;q^2)_{(m-1)/2}}{(q^{2-m};q^2)_{(m-1)/2}}
= \frac{(q^3;q^2)_{(m-1)/2}}{(-1)^{(m-1)/2}\,q^{-(m-1)^2/4}\,(q;q^2)_{(m-1)/2}}
= (-1)^{(m-1)/2}\,q^{(m-1)^2/4}\,[m].
\]


Proof of Theorem 10.7. Let ζ ≠ 1 be a primitive dth root of unity, where
d | m and m > 1 is odd (hence d is odd as well). Denote by
\[
F_n(q) = \frac{(q;q^2)_n\,(aq;q^2)_n\,(q/a;q^2)_n}{(q^2;q^2)_n\,(aq^2;q^2)_n\,(q^2/a;q^2)_n}\,[1+4n]\,(-1)^n q^{n^2}
\]
the nth term of the sum (10.24) and write (10.24) as
\[
\sum_{\ell=0}^\infty F_{\ell d}(q)\sum_{n=0}^{d-1}\frac{F_{\ell d+n}(q)}{F_{\ell d}(q)} = \frac{(q;q^2)_\infty\,(q^3;q^2)_\infty}{(aq^2;q^2)_\infty\,(q^2/a;q^2)_\infty}. \tag{10.26}
\]
Consider the limit as q → ζ radially, that is, q = rζ where r → 1⁻. On the
left-hand side we get
\[
\lim_{q\to\zeta}\frac{F_{\ell d+n}(q)}{F_{\ell d}(q)} = \frac{F_{\ell d+n}(\zeta)}{F_{\ell d}(\zeta)} = F_n(\zeta)
\]
and
\[
\lim_{q\to\zeta}F_{\ell d}(q) = (-1)^\ell\lim_{q\to\zeta}\frac{(q;q^2)_{\ell d}}{(q^2;q^2)_{\ell d}} = (-1)^\ell\,\frac{(\tfrac12)_\ell}{\ell!},
\]
since d is odd and $(a;\zeta^2)_{\ell d} = (a;\zeta^2)_d^\ell = (1-a^d)^\ell$. For the right-hand side
of (10.26),
\[
\lim_{q\to\zeta}\frac{(q;q^2)_\infty\,(q^3;q^2)_\infty}{(aq^2;q^2)_\infty\,(q^2/a;q^2)_\infty} = 0,
\]
because the part $(q;q^2)_{(d+1)/2}$ of the product $(q;q^2)_\infty$ vanishes at q = ζ. By
comparing the asymptotics of both sides of (10.26) as q → ζ we conclude
that
\[
\sum_{n=0}^{d-1} F_n(\zeta) = 0;
\]
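this in turn implies that
\[
\sum_{n=0}^{m-1} F_n(\zeta) = \sum_{n=0}^{d-1} F_n(\zeta) + \sum_{n=d}^{2d-1} F_n(\zeta) + \dots + \sum_{n=m-d}^{m-1} F_n(\zeta)
= \sum_{\ell=0}^{m/d-1} F_{\ell d}(\zeta)\sum_{n=0}^{d-1} F_n(\zeta) = 0.
\]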

Since this is true for any choice of primitive dth root of unity ζ, the equality can be
stated as the congruence $\sum_{n=0}^{m-1} F_n(q) \equiv 0 \pmod{\Phi_d(q)}$. The latter is valid
for any d | m, d > 1, hence
\[
\sum_{n=0}^{m-1} F_n(q) \equiv 0 \equiv q^{(m-1)^2/4}\,[m]\,\Bigl(\frac{-1}{m}\Bigr) \pmod{[m]}.
\]
On the other hand, it follows from Lemma 10.3 that
\[
\sum_{n=0}^{m-1} \frac{(q;q^2)_n\,(aq;q^2)_n\,(q/a;q^2)_n}{(q^2;q^2)_n\,(aq^2;q^2)_n\,(q^2/a;q^2)_n}\,[1+4n]\,(-1)^n q^{n^2} = q^{(m-1)^2/4}\,[m]\,\Bigl(\frac{-1}{m}\Bigr)
\]
when $a = q^m$ or $a = q^{-m}$; this implies that the congruences (10.23) hold
true modulo $1-aq^m$ and $a-q^m$. Since the polynomials [m], $1-aq^m$ and
$a-q^m$ are relatively prime, we obtain (10.23) modulo their product.
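
Both inputs of this proof can be observed numerically: the vanishing of the block sums at roots of unity and the closed form at $a = q^m$. The sketch below assumes the nth term in the form $\frac{(q;q^2)_n(aq;q^2)_n(q/a;q^2)_n}{(q^2;q^2)_n(aq^2;q^2)_n(q^2/a;q^2)_n}[1+4n](-1)^n q^{n^2}$ and writes $\bigl(\frac{-1}{m}\bigr)$ as $(-1)^{(m-1)/2}$ for odd m; the value of a used at the roots of unity is an arbitrary complex number, and the function names are illustrative.

```python
import cmath

def qpoch(a, q, n):
    """Finite q-Pochhammer symbol (a;q)_n; works for complex arguments."""
    out = 1
    for j in range(n):
        out *= 1 - a * q**j
    return out

def q_number(k, q):
    """q-number [k] = (1 - q^k)/(1 - q)."""
    return (1 - q**k) / (1 - q)

def F(n, a, q):
    """n-th term of the a-parametric sum (form assumed in the lead-in)."""
    q2 = q * q
    num = qpoch(q, q2, n) * qpoch(a * q, q2, n) * qpoch(q / a, q2, n)
    den = qpoch(q2, q2, n) * qpoch(a * q2, q2, n) * qpoch(q2 / a, q2, n)
    return num / den * q_number(4 * n + 1, q) * (-1) ** n * q ** (n * n)

def block_sum_at_root(d, a):
    """sum_{n=0}^{d-1} F_n(zeta) at zeta = exp(2*pi*i/d) with d odd, d > 1."""
    zeta = cmath.exp(2j * cmath.pi / d)
    return sum(F(n, a, zeta) for n in range(d))

def lemma_sides(m, q):
    """Terminating sum at a = q^m versus (-1)^((m-1)/2) q^((m-1)^2/4) [m]."""
    a = q**m
    lhs = sum(F(n, a, q) for n in range((m - 1) // 2 + 1))
    rhs = (-1) ** ((m - 1) // 2) * q ** ((m - 1) ** 2 // 4) * q_number(m, q)
    return lhs, rhs
```

For m = 3 the terminating sum is $1 - (1+q)(1+q^2) = -q(1+q+q^2)$, which the second function reproduces.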
We can summarise our derivation path of the results as follows:
\[
\begin{array}{ccccc}
\text{Theorem 10.8} & \overset{a=1}{\Longrightarrow} & \text{Theorem 10.5} & \overset{q\to1}{\Longrightarrow} & \text{formula (10.14)}\\
\Downarrow & & & & \\
\text{Theorem 10.7} & \overset{a\to1}{\Longrightarrow} & \text{Theorem 10.6} & \overset{q\to1}{\Longrightarrow} & \text{congruences (10.17)}
\end{array}
\]
The top of this scheme, Theorem 10.8, comes essentially for free from a
known q-hypergeometric identity, and many further entries from [34] lead
to remarkable (and quite difficult!) congruences, so that the q-Bible turns
out to be a treasury book for number theory.


Chapter notes 


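There is a modulo $p^4$ extension of Theorem 10.1,
\[
\binom{ap}{bp} \equiv \binom{a}{b} + ab(a-b)\binom{a}{b}\,p\sum_{k=1}^{p-1}\frac1k \pmod{p^4} \quad\text{for primes } p > 3.
\]
It involves the harmonic sums
\[
\sum_{k=1}^{p-1}\frac1k \equiv 0 \pmod{p^2} \quad\text{for primes } p > 3,
\]
and can also be deduced from suitable q-extensions using the method in
Section 10.1.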

The theme of Ramanujan-type formulae for 1/π is quite rich, and we do not
attempt to review it properly; the reader is advised to follow the survey
articles [8,88] and books [22,25] (which cover way more on the theme) for
this. We would nevertheless mention the original approach of J. Guillera
[38-41] for proving the formulae using the powerful Wilf-Zeilberger (WZ)
machinery; the method in its basic form is exemplified in
Exercise 10.2. Guillera manages to prove similar-looking identities for 1/π²
in terms of ${}_5F_4$ hypergeometric series, and his method (quite elementary in
nature!) is currently the only one which is available for such formulae.

The method of creative microscoping originates from the paper [44].
The name ‘creative microscoping’ is inspired by ‘creative telescoping’, the
latter coined in [64] for the method which was originally used by D. Zagier
for proving the recurrence equation in Apéry’s proof of the irrationality of
ζ(3) (see Exercise 7.2). In this chapter we have witnessed several other
applications of creative telescoping.

It is worth mentioning that the congruences in (10.17) and
Theorems 10.6 and 10.7 remain true when the sums are truncated at (p − 1)/2
or (m − 1)/2, respectively; these other(!) companion congruences can also
be settled by the method. Recent work of V. Guo, some in collaboration
with M. Schlosser, and with others (see, for example, [43, 45]), extends
the horizons of applicability of creative microscoping even further. One of
the latest achievements is a general framework (of q-analogues) of so-called
Dwork-type supercongruences.